SERINE PROTEASES, NUCLEIC ACIDS
ENCODING SERINE ENZYMES
AND VECTORS AND HOST CELLS INCORPORATING SAME

The present application claims priority under 35 U.S.C. §119, to co-pending U.S.
Provisional Patent Application Serial Number 60/523,609, filed November 19, 2003.

FIELD OF THE INVENTION 

The present invention provides novel serine proteases, novel genetic material 
encoding these enzymes, and proteolytic proteins obtained from Micrococcineae spp.,
including but not limited to Cellulomonas spp. and variant proteins developed therefrom. In
particular, the present invention provides protease compositions obtained from a
Cellulomonas spp., DNA encoding the protease, vectors comprising the DNA encoding the
protease, host cells transformed with the vector DNA, and an enzyme produced by the host 
cells. The present invention also provides cleaning compositions (e.g., detergent
compositions), animal feed compositions, and textile and leather processing compositions
comprising protease(s) obtained from a Micrococcineae spp., including but not limited to 
Cellulomonas spp. In alternative embodiments, the present invention provides mutant (i.e., 
variant) proteases derived from the wild-type proteases described herein. These mutant 
proteases also find use in numerous applications. 

BACKGROUND OF THE INVENTION 

Serine proteases are a subgroup of carbonyl hydrolases comprising a diverse class 
of enzymes having a wide range of specificities and biological functions (See e.g., Stroud, 
Sci. Amer., 131:74-88). Despite their functional diversity, the catalytic machinery of serine
proteases has been approached by at least two genetically distinct families of enzymes: 1)
the subtilisins; and 2) the mammalian chymotrypsin-related and homologous bacterial serine 
proteases (e.g., trypsin and S. griseus trypsin). These two families of serine proteases 
show remarkably similar mechanisms of catalysis (See e.g., Kraut, Ann. Rev. Biochem., 
46:331-358 [1977]). Furthermore, although the primary structure is unrelated, the tertiary
structure of these two enzyme families brings together a conserved catalytic triad of amino
acids consisting of serine, histidine and aspartate. The subtilisins and chymotrypsin-related 
serine proteases both have a catalytic triad comprising aspartate, histidine and serine. In 
the subtilisin-related proteases the relative order of these amino acids, reading from the
amino to carboxy terminus, is aspartate-histidine-serine. However, in the chymotrypsin- 
related proteases, the relative order is histidine-aspartate-serine. Much research has been 
conducted on the subtilisins, due largely to their usefulness in cleaning and feed 
applications. Additional work has been focused on the adverse environmental conditions 
(e.g., exposure to oxidative agents, chelating agents, extremes of temperature and/or pH) 
which can adversely impact the functionality of these enzymes in various applications. 
Nonetheless, there remains a need in the art for enzyme systems that are able to resist 
these adverse conditions and retain or have improved activity over those currently known in 
the art. 

SUMMARY OF THE INVENTION 

The present invention provides novel serine proteases, novel genetic material 
encoding these enzymes, and proteolytic proteins obtained from Micrococcineae spp., 
including but not limited to Cellulomonas spp. and variant proteins developed therefrom. In 
particular, the present invention provides protease compositions obtained from a 
Cellulomonas spp., DNA encoding the protease, vectors comprising the DNA encoding the
protease, host cells transformed with the vector DNA, and an enzyme produced by the host 
cells. The present invention also provides cleaning compositions (e.g., detergent 
compositions), animal feed compositions, and textile and leather processing compositions 
comprising protease(s) obtained from a Micrococcineae spp., including but not limited to 
Cellulomonas spp. In alternative embodiments, the present invention provides mutant (i.e., 
variant) proteases derived from the wild-type proteases described herein. These mutant 
proteases also find use in numerous applications. 

The present invention provides isolated serine proteases obtained from a member of 
the Micrococcineae. In some embodiments, the proteases are cellulomonadins. In some 
preferred embodiments, the protease is obtained from an organism selected from the group 
consisting of Cellulomonas, Oerskovia, Cellulosimicrobium, Xylanibacterium, and 
Promicromonospora. In some particularly preferred embodiments, the protease is obtained 
from Cellulomonas 69B4. In further embodiments, the protease comprises the amino acid 
sequence set forth in SEQ ID NO:8. In additional embodiments, the present invention 
provides isolated serine proteases comprising at least 45% amino acid identity with serine 
protease comprising SEQ ID NO:8. In some embodiments, the isolated serine proteases 
comprise at least 50% identity, preferably at least 55%, more preferably at least 60%, yet 
more preferably at least 65%, even more preferably at least 70%, more preferably at least 
75%, still more preferably at least 80%, more preferably 85%, yet more preferably 90%,
even more preferably at least 95%, and most preferably 99% identity. 
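
The identity thresholds recited above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned (gaps written as "-") and that percent identity is counted over columns in which both sequences carry a residue; other conventions exist, and this passage does not fix one. The function names and the toy sequences are hypothetical and are not taken from SEQ ID NO:8.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: inputs are pre-aligned, equal-length strings using '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns are excluded from the comparison
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 45.0) -> bool:
    """True if the pair reaches the given identity threshold (default 45%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical toy alignment, for illustration only.
print(percent_identity("ACDEFG", "ACDQFG"))
print(meets_threshold("ACDE", "AQQQ", 45.0))
```

Under this convention a candidate sequence would be tested against each recited level (45%, 50%, and so on up to 99%) by changing only the threshold argument.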

The present invention also provides compositions comprising isolated serine 
proteases having immunological cross-reactivity with the serine proteases obtained from the 
Micrococcineae. In some preferred embodiments, the serine proteases have 
immunological cross-reactivity with serine protease obtained from Cellulomonas 69B4. In 
alternative embodiments, the serine proteases have immunological cross-reactivity with 
serine protease comprising the amino acid sequence set forth in SEQ ID NO:8. In still 
further embodiments, the serine proteases have cross-reactivity with fragments (i.e.,
portions) of any of the serine proteases obtained from the Micrococcineae, the 
Cellulomonas 69B4 protease, and/or serine protease comprising the amino acid sequence 
set forth in SEQ ID NO:8. 

In some embodiments, the present invention provides the amino acid sequence set 
forth in SEQ ID NO:8, wherein the sequence comprises substitutions at at least one amino
acid position selected from the group comprising positions 2, 8, 10, 11, 12, 13, 14, 15, 16, 
24, 26, 31, 33, 35, 36, 38, 39, 40, 43, 46, 49, 51, 54, 61, 64, 65, 67, 70, 71, 76, 78, 79, 81, 
83, 85, 86, 90, 93, 99, 100, 105, 107, 109, 112, 113, 116, 118, 119, 121, 123, 127, 145, 
155, 159, 160, 163, 165, 170, 174, 179, 183, 184, 185, 186, 187, and 188. In alternative 
embodiments, the sequence comprises substitutions at at least one amino acid position
selected from the group comprising positions 1, 4, 22, 27, 28, 30, 32, 41, 47, 48, 55, 59, 63, 
66, 69, 75, 77, 80, 84, 87, 88, 89, 92, 96, 110, 111, 114, 115, 117, 128, 134, 144, 143, 146, 
151, 154, 156, 158, 161, 166, 176, 177, 181, 182, 187, and 189. 

In some preferred embodiments, the present invention provides protease variants 
having an amino acid sequence comprising at least one substitution of an amino acid made 
at a position equivalent to a position in a Cellulomonas 69B4 protease comprising the amino 
acid sequence set forth in SEQ ID NO:8. In alternative embodiments, the present invention 
provides protease variants having an amino acid sequence comprising at least one 
substitution of an amino acid made at a position equivalent to a position in a Cellulomonas 
69B4 protease comprising at least a portion of SEQ ID NO:8. In some embodiments, the 
substitutions are made at positions equivalent to positions 2, 8, 10, 11, 12, 13, 14, 15, 16,
24, 26, 31, 33, 35, 36, 38, 39, 40, 43, 46, 49, 51, 54, 61, 64, 65, 67, 70, 71, 76, 78, 79, 81, 
83, 85, 86, 90, 93, 99, 100, 105, 107, 109, 112, 113, 116, 118, 119, 121, 123, 127, 145, 
155, 159, 160, 163, 165, 170, 174, 179, 183, 184, 185, 186, 187, and 188 in a Cellulomonas 
69B4 protease having an amino acid sequence set forth in SEQ ID NO:8. In alternative 
embodiments, the substitutions are made at positions equivalent to positions 1, 4, 22, 27, 
28, 30, 32, 41, 47, 48, 55, 59, 63, 66, 69, 75, 77, 80, 84, 87, 88, 89, 92, 96, 110, 111, 114,
115, 117, 128, 134, 144, 143, 146, 151, 154, 156, 158, 161, 166, 176, 177, 181, 182, 187, 
and 189, in a Cellulomonas 69B4 protease having an amino acid sequence set forth in SEQ 
ID NO:8. In some preferred embodiments, the protease variants comprise the amino acid 
sequence comprising SEQ ID NO:8, wherein at least one amino acid position at positions 
selected from the group consisting of 14, 16, 35, 36, 65, 75, 76, 79, 123, 127, 159, and 179, 
are substituted with another amino acid. In some particularly preferred embodiments, the 
proteases comprise at least one mutation selected from the group consisting of R14L, R16I, 
R16L, R16Q, R35F, T36S, G65Q, Y75G, N76L, N76V, R79T, R123L, R123Q, R127A, 
R127K, R127Q, R159K, R159Q, and R179Q. In some alternative preferred embodiments, 
the proteases comprise multiple mutations selected from the group consisting of 
R16Q/R35F/R159Q, R16Q/R123L, R14L/R127Q/R159Q, R14L/R179Q,
R123L/R127Q/R179Q, R16Q/R79T/R127Q, and R16Q/R79T. In some particularly preferred 
embodiments, the proteases comprise the following mutations R123L, R127Q, and R179Q. 
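
The substitution codes used above follow the usual convention of wild-type residue, position, replacement residue (e.g., R16Q replaces Arg at position 16 with Gln), with a slash joining the members of a multiple mutation. A minimal sketch of reading such labels and applying them to a sequence is given below. The helper names are hypothetical, the pattern accepts any capital letter rather than only the twenty standard residue codes, and the toy sequence is illustrative and is not SEQ ID NO:8.

```python
import re
from typing import List, NamedTuple


class Substitution(NamedTuple):
    wild_type: str    # one-letter code of the original residue
    position: int     # 1-based position in the reference sequence
    replacement: str  # one-letter code of the new residue


_CODE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(code: str) -> Substitution:
    """Parse a single code such as 'R16Q'."""
    m = _CODE.match(code.strip())
    if m is None:
        raise ValueError(f"not a substitution code: {code!r}")
    return Substitution(m.group(1), int(m.group(2)), m.group(3))


def parse_variant(label: str) -> List[Substitution]:
    """Parse a slash-separated label such as 'R16Q/R35F/R159Q'."""
    return [parse_substitution(part) for part in label.split("/")]


def apply_substitutions(sequence: str, subs: List[Substitution]) -> str:
    """Return a copy of `sequence` with each substitution applied.

    Raises ValueError if a position is out of range or the residue found
    there does not match the stated wild-type residue.
    """
    residues = list(sequence)
    for s in subs:
        index = s.position - 1
        if not 0 <= index < len(residues):
            raise ValueError(f"position {s.position} is outside the sequence")
        if residues[index] != s.wild_type:
            raise ValueError(
                f"expected {s.wild_type} at {s.position}, found {residues[index]}"
            )
        residues[index] = s.replacement
    return "".join(residues)


print(parse_variant("R16Q/R35F/R159Q"))
# Hypothetical five-residue sequence, for illustration only.
print(apply_substitutions("ARNDC", parse_variant("R2K/D4E")))
```

Checking the wild-type residue before substituting catches labels that were numbered against a different reference sequence.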

The present invention also provides protease variants having amino acid sequences 
comprising at least one substitution selected from the group consisting of T36I, A38R, 
N170Y, N73T, G77T, N24A, T36G, N24E, L69S, T36N, T36S, E119R, N74G, T36W, S76W, 
N24T, N24Q, T36P, S76Y, T36H, G54D, G78A, S187P, R179V, N24V, V90P, T36D, L69H, 
G65P, G65R, N7L, W103M, N55F, G186E, A70H, S76V, G186V, R159F, T36Y, T36V, 
G65V, N24M, S51A, G65Y, Q71I, V66H, P118A, T116F, A38F, N24H, V66D, S76L, G177M, 
G186I, H85Q, Q71K, Q71G, G65S, A38D, P118F, A38S, G65T, N67G, T36R, P118R, 
S114G, Y75I, I181H, G65Q, Y75G, T36F, A38H, R179M, T183I, G78S, A64W, Y75F, 
G77S, N24L, W103I, V3L, Q81V, R179D, G54R, T36L, Q71M, A70S, G49F, G54L, G54H, 
G78H, R179I, Q81K, V90I, A38L, N67L, T109I, R179N, V66I, G78T, R179Y, S187T, N67K, 
N73S, E119K, V3I, Q71H, I11Q, A64H, R14E, R179T, L69V, V150L, Q71A, G65L, Q71N, 
V90S, A64N, I11A, N145I, H85T, A64Y, N145Q, V66L, S92G, S188M, G78D, N67A, N7S,
V80H, G54K, A70D, P118H, D2G, G54M, Q81H, D2Q, V66E, R79P, A38N, N145E, R179L, 
T109H, R179K, V66A, G54A, G78N, T109A, R179A, N7A, R179E, H104K, A64R, and 
V80L. In further embodiments, the amino acid sequence of the protease variants
comprises at least one substitution selected from the group consisting of H85R, H85L, T62I,
N67H, G54I, N24F, T40V, T86A, G63V, G54Q, A64F, G77Y, R35F, T129S, R61M, I126L, 
S76N, T182V, R79G, T109P, R127F, R123E, P118I, T109R, 171S, T183K, N67T, P89N, 
F1T, A64K, G78I, T109L, G78V, A64M, A64S, T10G, G77N, A64L, N67D, S76T, N42H, 
D184F, D184R, S76I, S78R, A38K, V72I, V3T, T107S, A38V, F47I, N55Q, S76E, P118Q, 
T109G, Q71D, P118K, N67S, Q167N, N145G, I28L, I11T, A64I, G49K, G49A, G65A,
N170D, H85K, S185I, I181N, V80F, L69W, S76R, D184H, V150M, T183M, N67Q, S51Q,
A38Y, T107V, N145T, Q71F, A83N, S76A, N67R, T151L, T163L, S51F, Q81I, F47M, A41N, 
P118E, N67Y, T107M, N73H, 67V, G63W, T10K, I181G, S187E, T107H, D2A, L142V, 
A143N, A8G, S187L, V90A, G49L, N170L, G65H, T36C, G12W, S76Q, A143S, F1A, N7H, 
S185V, A110T, N55K, N67F, N7I, A110S, N170A, Q81D, A64Q, Q71L, A38I, N112I, V90T, 
N145L, A64T, I11S, A30S, R123I, D2H, V66M, Q71R, V90L, L68W, N24S, R159E, V66N,
D184Q, E133Q, A64V, D2N, G13M, T40S, S76K, G177S, G63Q, S15F, A8K, A70G, and 
A38G. In some preferred embodiments, these variants have improved casein hydrolysis 
performance as compared to wild-type Cellulomonas 69B4 protease. 

The present invention also provides protease variants having amino acid sequences 
comprising at least one substitution selected from the group consisting of R35E, R35D, 
R14E, R14D, Q167E, G49C, S15R, S15H, I11W, S15C, G49Q, R35Q, R35V, G49E,
R123D, R123Y, G49H, A38D, R35S, F47R, R123C, T151L, R14T, R35T, R123E, G49A, 
G49V, D56L, R35N, R35A, G12D, R35C, R123N, T46V, R123H, S155C, T121E, R127E, 
S113C, R123T, R16E, T46F, T121L, A38C, T46E, R123W, T44E, N55G, A8G, E119G, 
R35P, R14G, F59W, R127S, R61E, R14S, S155W, R123F, R123S, G49N, R127D, E119Y, 
A48E, N170D, R159T, S99A, G12Q, P118R, F165W, R127Q, R35H, G12N, A22C, G12V, 
R16T, Y57G, T100A, T46Y, R159E, E119R, T107R, T151C, G54C, E119T, R61V, I11E,
R14I, R61M, S15E, A22S, R16C, T36C, R16V, L125Q, M180L, R123Q, R14A, R14Q, 
R35M, R127K, R159Q, N112P, G124D, R179E, G49L, A41D, G177D, R123V, E119V, 
T10L, T109E, R179D, G12S, T10C, G91Q, S15Y, S155Y, R14C, T163D, T121F, R14N, 
F165E, N24E, A41C, R61T, G12I, P118K, T46C, I11T, R159D, N170C, R159V, S155I,
I11Q, D2P, T100R, R159S, S114C, R16D, and P134R. In alternative embodiments, the
protease variants have amino acid sequences comprising at least one substitution selected 
from the group consisting of S99G, T100K, R127A, F1P, S155V, T128A, F165H, G177E, 
A70M, S140P, A87E, D2I, R159K, T36V, R179C, E119N, T10Y, I172A, A8T, F47V, W103L, 
R61K, D2V, R179V, D2T, R159N, E119A, G54E, R16Q, G49S, R16I, S51L, S155E, S15M, 
R179I, T10Q, G12H, R159C, R179T, T163C, R159A, A132S, N157D, G13E, L141M, A41T, 
R123M, R14M, A8R, Q81P, N24T, T10D, A88F, R61Q, S99K, R179Y, T121A, N112E, 
S155T, T151V, S99Q, T10E, S92T, T109K, T44C, R123A, A87C, S15F, S155F, D56F, 
T10F, A83H, R179M, T121D, G13D, P118C, G49F, Q174C, S114E, T86E, F1N, T115C,
R127C, R123K, V66N, G12Y, S113A, S15N, A175T, R79T, R123G, R179S, R179N, R123I, 
P118A, S187E, N112D, A70G, E119L, E119S, R159M, R14H, R179F, A64C, A41S, 
R179W, N24G, T100Q, P118W, Q81G, G49K, R14L, N55A, R35K, R79V, D2M, T160D, 
A83D, R179L, S51A, G12P, S99H, N42D, S188E, T10M, L125M, T116N, A70P, Q174S, 
G65D, S113D, E119Q, A83E, N170L, Q81A, S51C, P118G, Q174T, I28V, S15G, and
T116G. In some preferred embodiments, these variants have improved LAS stability as
compared to wild-type Cellulomonas 69B4 protease. 

The present invention also provides protease variants having amino acid sequences 
comprising at least one substitution selected from the group consisting of G26I, G26K, 
G26Q, G26V, G26W, F27V, F27W, I28P, T29E, T129W, T40D, T40Q, R43D, P43H, P43K, 
P43L, A22C, T40H, P89W, G91L, S18E, F59K, A30M, A30N, G31M, C33M, G161L, G161V, 
P43N, G26E, N73P, G84C, G84P, G45V, C33L, Y9E, Y9P, A147E, C158H, I28W, A48P, 
A22S, T62R, S137R, S155P, S155R, G156I, G156L, Q81A, R96C, I4D, I4P, A70P, C105E, 
C105G, C105K, C105M, C105N, C105S, T128A, T128V, T128G, S140P, G12D, C33N, 
C33E, T164G, G45A, G156P, S99A, Q167L, S155W, I28T, R96F, A30P, R123W, T40P, 
T39R, C105P, T100A, C105W, S155K, T46Y, R123F, I4G, S155Y, T46V, A93S, Y57N, 
Q81S, G186S, G31H, T10Y, G31V, A83H, A38D, R123Y, R79T, C158G, G31Y, Q81P, 
R96E, A30Y, R159K, A22T, T40N, Y57M, G31N, Q81G, T164L, T121E, T10F, Q146P, 
R123N, V3R, P43G, Q81H, Q81D, G161I, C158M, N24T, T10W, T128S, T160I, Y176P, 
S155F, T128C, L125A, P168Y, T62G, F166S, S188A, Q81F, T46W, A70G, and A38G. In 
alternative embodiments, the protease variants have amino acid sequences comprising at 
least one substitution selected from the group consisting of S188E, S188V, Y117K, Y117Q, 
Y117R, Y117V, R127K, R127Q, R123L, T86S, R123I, Q81E, L125M, H32A, S188T, N74F, 
C33D, F27I, A83M, Q71Y, R123T, V90A, F59W, L141C, N170E, T46F, S51V, G162P, 
S185R, A41S, R79V, T151C, T107S, T129Y, M180L, F166C, C105T, T160E, P89A, R159T, 
T183P, S188M, T10L, G25S, N24S, E119L, T107L, T107Q, G161K, G15Q, S15R, G153K, 
G153V, S188G, A83E, G186P, T121D, G49A, S15C, C105Y, C105A, R127F, Q71A, T10C, 
R179K, T86I, W103N, A87S, F166A, A83F, R123Q, A132C, A143H, T163I, T39V, A93D, 
V90M, R123K, P134W, G177N, V115I, S155T, T110D, G105L, N170D, T107A, G84V, 
G84M, L111K, P168I, G154L, T183I, S99G, S15T, A8G, S15N, P189S, S188C, T100Q, 
A110G, A121A, G12A, R159V, G31A, G154R, T182L, V115L, T160Q, T107F, R159Q,
G144A, S92T, T101S, A83R, G12H, S15H, T116Q, T36V, G154, Q81C, V130T, T183A,
P118T, A87E, T86M, V150N, and N24E. In some preferred embodiments, these variants
have improved thermostability as compared to wild-type Cellulomonas 69B4 protease. 

The present invention also provides protease variants having amino acid sequences 
comprising at least one substitution selected from the group consisting of T36I, I172T, 
N24E, N170Y, G77T, G186N, I181L, N73T, A38R, N74G, N24A, G54D, S76D, R123E,
159E, N112E, R35E, R179V, R123D, N24T, R179T, R14L, A38D, V90P, R14Q, R123I, 
R179D, S76V, R79G, R35L, S76E, S76Y, R79D, R79P, R35Q, R179N, N112D, R179E, 
G65P, Y75G, V90S, R179M, R35F, R123F, A64I, N24Q, R14I, R179A, R127A, R179I,
N170D, R35A, R159F, T109E, R14D, N67D, G49A, N112Q, G78D, T121E, L69S, T116E, 
V90I, T36S, T36G, N145E, T86D, S51D, R179K, T107E, T129S, L142V, R79A, R79E, 
A38H, T107S, R123A, N55E, R123L, R159N, G65D, R14N, G65Q, R123Q, N24V, R14G, 
T116Q, A38N, R159Q, R179Y, A83E, N112L, S99N, G78A, T10N, H85Q, R35Q, N24L, 
N24H, G49S, R79L, S76T, S76L, G65S, N55F, R79V, G65T, R123N, T86E, Y75F, F1T, 
S76N, S99V, R79T, N112V, R79M, T107V, R79S, G54E, G65V, R127Q, R159D, T107H, 
H85T, R35T, T36N, Q81E, R123H, S76I, A38F, V90T, and R14T. In alternative 
embodiments, the protease variants have amino acid sequences comprising at least one 
substitution selected from the group consisting of G65L, S99D, T107M, S113T, S99T, 
G77S, R14M, A64N, R61M, A70D, Q71G, A93D, S92G, N112Y, S15W, R159K, N67G, 
T10E, R127H, A64Y, R159C, A38L, T160E, T183E, R127S, A8E, S51Q, N7L, G63D, A38S, 
R35H, R14K, T107I, G12D, A64L, S76W, A41N, R35M, A64V, A38Y, T183I, W103M, A41D, 
R127K, T36D, R61T, G65Y, G13S, R35Y, R123T, A64H, G49H, A70H, A64F, R127Y, 
R61E, A64P, T121D, V115A, R123Y, T101S, T182V, H85L, N24M, R127E, N145D, Q71H, 
S76Q, A64T, G49F, A64Q, T10D, F1D, A70G, R35W, Q71D, N121I, A64M, T36H, A8G, 
T107N, R35S, N67T, S92A, N170L, N67E, S114A, R14A, R14S, Q81D, S51H, R123S, 
A93S, R127F, 119V, T40V, S185N, R123G, R179L, S51V, T163D, T109I, A64S, V72I, 
N67S, R159S, H85M, T109G, Q71S, R61H, T107A, Q81V, V90N, T109A, A38T, N145T, 
R159A, A110S, Q81H, A48E, S51T, A64W, R159L, N67H, A93E, T116F, R61S, R123V, 
V3L, and R159Y. In some preferred embodiments, these variants have improved keratin 
hydrolysis activity as compared to wild-type Cellulomonas 69B4 protease. 

The present invention also provides protease variants having amino acid sequences 
comprising at least one substitution selected from the group consisting of T36I, P89D, 
A93T, A93S, T36N, N73T, T36G, R159F, T36S, A38R, S99W, S76W, T36P, G77T, G54D, 
R127A, R159E, H85Q, T36D, S76L, S99N, Y75G, S76Y, R127S, N24E, R127Q, D184F,
N170Y, N24A, S76T, H85L, Y75F, S76V, L69S, R159K, R127K, G65P, N74G, R159H, 
G65Q, G186V, A48Q, T36H, N67L, R14I, R127L, T36Y, S76I, S114G, R127H, S187P, V3L, 
G78D, R123I, I181Q, R35F, H85R, R127Y, N67S, Q81P, R123F, R159N, S99A, S76D, 
A132V, R127F, A143N, S92A, N24T, R79P, S76N, RUM, G186E, N24Q, N67A, R127T, 
H85K, G65T, G65Y, R179V, Y75I, I11Q, A38L, T36L, R159Y, R159D, N24V, G65S, N157D,
G186I, G54Q, N67Y, R127G, S76A, A38S, T109E, V66H, T116F, R123L, G49A, A64H, 
T36W, D184H, S99D, G161K, P134E, A64F, N67G, S99T, D2Q, S76E, R16Q, G54N, 
N67V, R35L, Q71I, N7L, N112E, L69H, N24H, G54I, R16L, N24M, A64Y, S113A, H85F, 
R79G, I11A, T121D, R61V, and G65L. In alternative embodiments, the protease variants
have amino acid sequences comprising at least one substitution selected from the group
consisting of N67Q, S187Q, Q71H, T163D, R61K, R159V, Q71F, V31F, V90I, R79D, 
T160E, R123Q, A38Y, S113G, A88F, A70G, I11T, G78A, N24L, S92G, R14L, D184R,
G54L, N112L, H85Y, R16N, G77S, R179T, V80L, G65V, T121E, Q71D, R16G, P89N, 
N42H, G49F, I11S, R61M, R159C, G65R, T183I, A93D, L111E, S51Q, G78N, N67T, A38N,
T40V, A64W, R159L, T10E, R179K, R123E, V90P, A64N, G161E, H85T, A8G, L142V, 
A41N, S185I, Q71L, A64T, R16I, A38D, G54M, N112Q, R16A, R14E, V80H, N170D, S99G, 
R179N, S15E, G49H, A70P, A64S, G54A, S185W, R61H, T10Q, A38F, N170L, T10L, 
N67F, G12D, D184T, R14N, S187E, RHP, N112D, S140A, N112G, G49S, L111D, N67M,
V150L, G12Y, R123K, P89V, V66D, G77N, S51T, A8D, I181H, T86N, R179D, N55F, N24S, 
D184L, R61S, N67K, G186L, F1T, R159A, I11L, R61T, D184Q, A93E, Q71T, R179E, 
L69W, T163I, S188Q, L125V, A38V, R35A, P134G, A64V, N145D, V90T, and A143S. In 
some preferred embodiments, these variants have improved BMI performance as compared 
to wild-type Cellulomonas 69B4 protease. 

The present invention also provides protease variants having amino acid sequences 
comprising at least one substitution selected from the group consisting of T36I, N170Y, 
A38R, R79P, G77T, L69S, N73T, S76V, S76Y, R179V, T36N, N55F, R159F, G54D, G65P, 
L69H, T36G, G177M, N24E, N74G, R159E, T36S, Y75G, S76I, S76D, A8R, A24A, V90P, 
R159C, G65Q, T121E, A8V, S76L, T109E, R179M, A8T, T107N, G186E, S76W, R123E, 
A38F, T36P, N67G, Y75F, S76N, R179I, S187P, N67V, V90S, R127A, R179Y, R35F, 
N145S, G65S, R61M, S51A, R179N, R123D, N24T, N55E, R79C, G186V, R123I, G161E, 
G65Y, A38S, R14L, V90I, R79G, N145E, N67L, R127S, R150Y, M180D, N67T, A93D, 
T121D, Q81V, T109I, A93E, T107S, R179T, R179L, R179K, R159D, R179A, R79E, R123F,
R79D, T36D, A64N, L142V, T109A, I172V, A83N, T85A, R179D, A38L, I126L, R127Q,
R127L, L69W, R127K, G65T, R127H, P134A, N67D, R14M, N24Q, A143N, N55S, N67M,
S51D, S76E, T163D, A38D, R159K, T183I, G63V, A8S, T107M, H85Q, N112E, N67F,
N67S, A64H, T86I, P134E, T182V, N67Y, A64S, G78D, V90T, R61T, R16Q, G65R, T86L, 
V90N, R159Q, G54I, S76C, R179E, V66D, L69V, R127Y, R35L, R14E, and T86F. In 
alternative embodiments, the protease variants have amino acid sequences comprising at 
least one substitution selected from the group consisting of G186I, A64Q, T109G, G64L, 
N24L, A8E, N112D, A38H, R179W, S114G, R123L, A8L, T129S, N170D, R159N, N67C, 
S92C, T107A, G54E, T107E, T36V, R127T, A8N, H85L, A110S, N170C, A64R, A132V, 
T36Y, G63D, W103M, T151V, R123P, W103Y, S76T, S187T, R127F, N67A, P171M, A70S, 
R159H, S76Q, L125V, G54Q, G49L, R14I, R14Q, A83I, V90L, T183E, R159A, T101S, 
G65D, G54A, T107Q, Q71M, T86E, N24M, N55Q, R61V, P134D, R96K, A88F, N145Q, 
A64M, A64T, N24V, S140A, A8H, A64I, R123Q, T183Q, N24H, A64W, T62I, T129G, R35A,
T40V, I11T, A38N, N145G, A175T, G77Q, T109H, A8P, R35E, T109N, A110T, N67Q, 
G63P, H85R, S92G, A175V, S51Q, G63Q, T116F, G65A, R79L, N145P, L69Q, Q146D, 
A83D, F166Y, R123A, T121L, R123H, A70P, T182W, S76A, A64F, T107H, G186L, Q81I,
R123K, A64L, N67R, V3L, S187E, S161K, T86M, I4M, G77N, G49A, A41N, G54M, T107V,
Q81E, A38I, T109L, T183K, A70G, Q71D, T183L, Q81H, A64V, A93Q, S188E, S51F, 
G186P, G186T, R159L, P134G, N145T, N55V, V66E, R159V, Y176L, and R16L. In some
preferred embodiments, these variants have improved BMI performance under low pH 
conditions, as compared to wild-type Cellulomonas 69B4 protease. 

The present invention also provides serine proteases comprising at least a portion
of an amino acid sequence selected from the group consisting of SEQ ID NO:8, SEQ ID 
NO:6, SEQ ID NO:7, and SEQ ID NO:9. In some embodiments, the nucleotide sequences 
encoding these serine proteases comprise a nucleotide sequence selected from the group 
consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, and SEQ ID NO:5. 
In some embodiments, the serine proteases are variants having amino acid sequences that
are similar to that set forth in SEQ ID NO:8. In some preferred embodiments, the proteases 
are obtained from a member of the Micrococcineae. In some particularly preferred 
embodiments, the proteases are obtained from an organism selected from the group 
consisting of Cellulomonas, Oerskovia, Cellulosimicrobium, Xylanibacterium, and
Promicromonospora. In some particularly preferred embodiments, the protease is obtained
from variants of Cellulomonas 69B4. 

The present invention also provides isolated protease variants having amino acid 
sequences comprising at least one substitution of an amino acid made at a position 
equivalent to a position in a Cellulomonas 69B4 protease comprising the amino acid
sequence set forth in SEQ ID NO:8, wherein the amino acid of the protease comprises
Arg14, Ser15, Arg16, Cys17, His32, Cys33, Phe52, Asp56, Thr100, Val115, Thr116,
Tyr117, Pro118, Glu119, Ala132, Glu133, Pro134, Gly135, Asp136, Ser137, Thr151,
Ser152, Gly153, Gly154, Ser155, Gly156, Asn157, Thr164, and Phe165. In some
embodiments, the catalytic triad of the proteases comprises His32, Asp56, and Ser137. In
alternative embodiments, the proteases comprise Cys131, Ala132, Glu133, Pro134, Gly135,
Thr151, Ser152, Gly153, Gly154, Ser155, Gly156, Asn157 and Gly162, Thr163, and
Thr164. In some preferred embodiments, the amino acid sequence of the proteases
comprises Phe52, Tyr117, Pro118 and Glu119. In some particularly preferred embodiments,
the amino acid sequences of the proteases have main-chain to main-chain hydrogen
bonding from Gly154 to the substrate main-chain.
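
The Background distinguishes the two serine protease families by the order of the catalytic triad residues from the amino to the carboxy terminus: aspartate-histidine-serine for the subtilisins and histidine-aspartate-serine for the chymotrypsin-related enzymes. The following sketch applies that ordering rule to a set of triad positions. The function names are hypothetical, and the classification is only as good as the ordering rule itself; it is not a substitute for structural or sequence analysis.

```python
from typing import Dict


def triad_order(positions: Dict[str, int]) -> str:
    """Return the triad residues ordered from N- to C-terminus, e.g. 'His-Asp-Ser'.

    `positions` maps the three-letter residue names Asp, His and Ser to
    1-based sequence positions.
    """
    if set(positions) != {"Asp", "His", "Ser"}:
        raise ValueError("expected exactly the keys 'Asp', 'His' and 'Ser'")
    return "-".join(sorted(positions, key=lambda name: positions[name]))


def family_from_triad(positions: Dict[str, int]) -> str:
    """Classify by the ordering rule stated in the Background."""
    order = triad_order(positions)
    if order == "Asp-His-Ser":
        return "subtilisin-like"
    if order == "His-Asp-Ser":
        return "chymotrypsin-like"
    return "unclassified"


# Triad positions recited for the protease of SEQ ID NO:8.
print(family_from_triad({"His": 32, "Asp": 56, "Ser": 137}))
# Hypothetical positions chosen only to show the other ordering.
print(family_from_triad({"Asp": 10, "His": 40, "Ser": 200}))
```

With the recited positions His32, Asp56 and Ser137, the order is histidine-aspartate-serine, which under this rule is the chymotrypsin-related arrangement.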
In embodiments, the proteases of the present invention comprise three disulfide
bonds. In some preferred embodiments, the disulfide bonds are located between C17 and 
C38, C95 and C105, and C131 and C158. In some particularly preferred embodiments, the 
disulfide bonds are located between C17 and C38, C95 and C105, and C131 and C158 of 
SEQ ID NO:8. In alternative protease variant embodiments, the disulfide bonds are located 
at positions equivalent to the disulfide bonds in SEQ ID NO:8. 

The present invention also provides isolated protease variants having amino acid 
sequences comprising at least one substitution of an amino acid made at a position 
equivalent to a position in a Cellulomonas 69B4 protease comprising the amino acid 
sequence set forth in SEQ ID NO:8, wherein the variants have altered substrate specificities 
as compared to wild-type Cellulomonas 69B4 protease. In some further preferred 
embodiments, the variants have altered pIs as compared to wild-type Cellulomonas 69B4
protease. In additional preferred embodiments, the variants have improved stability as 
compared to wild-type Cellulomonas 69B4 protease. In still further preferred embodiments, 
the variants exhibit altered surface properties. In some particularly preferred embodiments, 
the variants exhibit altered surface properties as compared to wild-type Cellulomonas 69B4 
protease. In additional particularly preferred embodiments, the variants comprise
at least one substitution at sites selected from the group consisting of 1, 2, 4, 7, 8, 10, 11,
12, 13, 14, 15, 16, 22, 24, 25, 32, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 
49, 50, 51, 52, 53, 54, 55, 57, 59, 61, 62, 63, 64, 65, 66, 67, 68, 69, 71, 73, 74, 75, 76, 77, 
78, 79, 80, 81, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 95, 99, 100, 101, 102, 103, 104, 
105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 123, 
124, 126, 127, 128, 130, 131, 132, 133, 134, 135, 137, 143, 144, 145, 146, 147, 148, 152, 
153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 170, 171, 
173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, and 184. 

The present invention also provides protease variants having at least one improved 
property as compared to the wild-type protease. In some particularly preferred 
embodiments, the variants are variants of a serine protease obtained from a member of the 
Micrococcineae. In some particularly preferred embodiments, the proteases are obtained 
from an organism selected from the group consisting of Cellulomonas, Oerskovia, 
Cellulosimicrobium, Xylanibacterium, and Promicromonospora. In some particularly 
preferred embodiments, the protease is obtained from variants of Cellulomonas 69B4. In 
some preferred embodiments, at least one improved property is selected from the group 
consisting of acid stability, thermostability, casein hydrolysis, keratin hydrolysis, cleaning 
performance, and LAS stability. 
The present invention also provides expression vectors comprising a polynucleotide
sequence encoding protease variants having amino acid sequences comprising at least one 
substitution of an amino acid made at a position equivalent to a position in a Cellulomonas 
69B4 protease comprising the amino acid sequence set forth in SEQ ID NO:8. In further 
embodiments, the present invention provides host cells comprising these expression 
vectors. In some particularly preferred embodiments, the host cells are selected from the 
group consisting of Bacillus sp., Streptomyces sp., Aspergillus sp., and Trichoderma sp.
The present invention also provides the serine proteases produced by the host cells. 

The present invention also provides variant proteases comprising an amino acid 
sequence selected from the group consisting of SEQ ID NOS:54, 56, 58, 60, 62, 64, 66, 68, 
70, 72, 74, 76, and 78. In some preferred embodiments, the amino acid sequence is
encoded by a polynucleotide sequence selected from the group consisting of SEQ ID 
NOS:53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, and 77. In further embodiments, the 
present invention provides expression vectors comprising a polynucleotide sequence 
encoding at least one protease variant. In additional embodiments, the present invention 
provides host cells comprising these expression vectors. In some particularly preferred 
embodiments, the host cells are selected from the group consisting of Bacillus sp.,
Streptomyces sp., Aspergillus sp., and Trichoderma sp. The present invention also provides 
the serine proteases produced by the host cells. 

The present invention also provides compositions comprising at least a portion of an 
isolated serine protease obtained from a member of the Micrococcineae, wherein the
protease is encoded by a polynucleotide sequence selected from the group consisting of 
SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, and SEQ ID NO:4. In some preferred 
embodiments, the sequence comprises at least a portion of SEQ ID NO:1. In further 
embodiments, the present invention provides host cells comprising these expression 
vectors. In some particularly preferred embodiments, the host cells are selected from the 
group consisting of Bacillus sp., Streptomyces sp., Aspergillus sp., and Trichoderma sp. 
The present invention also provides the serine proteases produced by the host cells. 

The present invention also provides variant serine proteases, wherein the proteases 
comprise at least one substitution corresponding to the amino acid positions in SEQ ID 
NO:8, and wherein variant proteases have better performance in at least one property 
selected from the group consisting of keratin hydrolysis, thermostability, casein activity, LAS 
stability, and cleaning, as compared to wild-type Cellulomonas 69B4 protease. 

The present invention also provides isolated polynucleotides comprising a nucleotide 
sequence (i) having at least 70% identity to SEQ ID NO:4, or (ii) being capable of hybridizing 
to a probe derived from the nucleotide sequence set forth in SEQ ID NO:4, under conditions
of intermediate to high stringency, or (iii) being complementary to the nucleotide sequence 
set forth in SEQ ID NO:4. In embodiments, the present invention provides expression 
vectors encoding at least one such polynucleotide. In further embodiments, the present 

5 invention provides host cells comprising these expression vectors. In some particularly 
preferred embodiments, the host cells are selected from the group consisting of Bacillus sp. t 
Streptomyces sp M Aspergillus sp., and Trichoderma sp. The present invention also provides 
the serine proteases produced by the host cells. In further embodiments, the present 
invention provides polynucleotides that are complementary to at least a portion of the 

io sequence set forth in SEQ ID N0:4. 
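
By way of illustration only, complementarity between two polynucleotide strands can be checked computationally. The following is a minimal sketch, assuming plain A/C/G/T strings; the function names and the example sequence are hypothetical placeholders and are not taken from SEQ ID NO:4.

```python
# Illustrative sketch only: the helper names and the example sequence are
# hypothetical and do not come from the sequence listing.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    invalid = set(seq) - set("ACGTacgt")
    if invalid:
        raise ValueError(f"unexpected characters: {sorted(invalid)}")
    # Complement each base, then reverse so the result reads 5' to 3'.
    return seq.translate(COMPLEMENT)[::-1]


def is_complementary(strand: str, other: str) -> bool:
    """True if `other` is the exact reverse complement of `strand`."""
    return reverse_complement(strand).upper() == other.upper()


if __name__ == "__main__":
    example = "ATGACCGGT"  # placeholder fragment, not from SEQ ID NO:4
    print(reverse_complement(example))
```

This sketch tests exact complementarity only; hybridization under intermediate to high stringency tolerates some mismatch and is not modeled here.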

The present invention also provides methods of producing an enzyme having 
protease activity, comprising: transforming a host cell with an expression vector comprising 
a polynucleotide having at least 70% sequence identity to SEQ ID NO:4; cultivating the 
transformed host cell under conditions suitable for the host cell to produce the protease. In some embodiments, the host 
cell is selected from the group consisting of Streptomyces, Aspergillus, Trichoderma and 
Bacillus species. 

The present invention also provides probes comprising a 4 to 150 nucleotide sequence 
substantially identical to a corresponding fragment of SEQ ID NO:4, wherein the probe is 
used to detect a nucleic acid sequence coding for an enzyme having proteolytic activity, and 
wherein the nucleic acid sequence is obtained from a member of the Micrococcineae. In 
some embodiments, the Micrococcineae is a Cellulomonas spp. In some preferred 
embodiments, the Cellulomonas is Cellulomonas strain 69B4. 

The present invention also provides cleaning compositions comprising at least one 
serine protease obtained from a member of the Micrococcineae. In some embodiments, at 
least one protease is obtained from an organism selected from the group consisting of 
Cellulomonas, Oerskovia, Cellulosimicrobium, Xylanibacterium, and Promicromonospora. In 
some preferred embodiments, the protease is obtained from Cellulomonas 69B4. In some 
particularly preferred embodiments, at least one protease comprises the amino acid 
sequence set forth in SEQ ID NO:8. In some further embodiments, the present invention 
provides isolated serine proteases comprising at least 45% amino acid identity with serine 
protease comprising SEQ ID NO:8. In some embodiments, the isolated serine proteases 
comprise at least 50% identity, preferably at least 55%, more preferably at least 60%, yet 
more preferably at least 65%, even more preferably at least 70%, more preferably at least 
75%, still more preferably at least 80%, more preferably 85%, yet more preferably 90%, 
even more preferably at least 95%, and most preferably 99% identity. 

The present invention further provides cleaning compositions comprising at least one 
serine protease, wherein at least one of the serine proteases has immunological cross- 
reactivity with the serine protease obtained from a member of the Micrococcineae. In some 
preferred embodiments, the serine proteases have immunological cross-reactivity with 
serine protease obtained from Cellulomonas 69B4. In alternative embodiments, the serine 
proteases have immunological cross-reactivity with serine protease comprising the amino 
acid sequence set forth in SEQ ID NO:8. In still further embodiments, the serine proteases 
have cross-reactivity with fragments (i.e., portions) of any of the serine proteases obtained 
from the Micrococcineae, the Cellulomonas 69B4 protease, and/or serine protease 
comprising the amino acid sequence set forth in SEQ ID NO:8. 

The present invention further provides cleaning compositions comprising at least one 
serine protease, wherein the protease is a variant protease having an amino acid sequence 
comprising at least one substitution of an amino acid made at a position equivalent to a 
position in a Cellulomonas 69B4 protease having an amino acid sequence set forth in SEQ 
ID NO:8. In some embodiments, the substitutions are made at positions equivalent to 
positions 2, 8, 10, 11, 12, 13, 14, 15, 16, 24, 26, 31, 33, 35, 36, 38, 39, 40, 43, 46, 49, 51, 
54, 61, 64, 65, 67, 70, 71, 76, 78, 79, 81, 83, 85, 86, 90, 93, 99, 100, 105, 107, 109, 112, 
113, 116, 118, 119, 121, 123, 127, 145, 155, 159, 160, 163, 165, 170, 174, 179, 183, 184, 
185, 186, 187, and 188 in a Cellulomonas 69B4 protease comprising an amino acid 
sequence set forth in SEQ ID NO:8. In alternative embodiments, the substitutions are made 
at positions equivalent to positions 1, 4, 22, 27, 28, 30, 32, 41, 47, 48, 55, 59, 63, 66, 69, 75, 
77, 80, 84, 87, 88, 89, 92, 96, 110, 111, 114, 115, 117, 128, 134, 144, 143, 146, 151, 154, 
156, 158, 161, 166, 176, 177, 181, 182, 187, and 189, in a Cellulomonas 69B4 protease 
comprising an amino acid sequence set forth in SEQ ID NO:8. In further embodiments, the 
protease comprises at least one amino acid substitution at positions 14, 16, 35, 36, 65, 75, 
76, 79, 123, 127, 159, and 179, in an equivalent amino acid sequence to that set forth in 
SEQ ID NO:8. In still further embodiments, the protease comprises at least one mutation 
selected from the group consisting of R14L, R16I, R16L, R16Q, R35F, T36S, G65Q, Y75G, 
N76L, N76V, R79T, R123L, R123Q, R127A, R127K, R127Q, R159K, R159Q, and R179Q. 
In yet additional embodiments, the protease comprises a set of mutations selected from the 
group consisting of the sets R16Q/R35F/R159Q, R16Q/R123L, R14L/R127Q/R159Q, 
R14L/R179Q, R123L/R127Q/R179Q, R16Q/R79T/R127Q, and R16Q/R79T. In some 
particularly preferred embodiments, the protease comprises the following mutations R123L, 
R127Q, and R179Q. In some particularly preferred embodiments, the variant serine 
proteases comprise at least one substitution corresponding to the amino acid positions in 
SEQ ID NO:8, and wherein the variant proteases have better performance in at least one 
property selected from the group consisting of keratin hydrolysis, thermostability, casein 
activity, LAS stability, and cleaning, as compared to wild-type Cellulomonas 69B4 protease. 
In some embodiments, the variant protease comprises an amino acid sequence selected 
from the group consisting of SEQ ID NOS:54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, and 
78. In alternative embodiments, the variant protease amino acid sequence is encoded by a 
polynucleotide sequence selected from the group consisting of SEQ ID NOS:53, 55, 57, 59, 
61, 63, 65, 67, 69, 71, 73, 75, and 77. 
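
For illustration, substitution designations such as R14L (wild-type residue, position, replacement residue) and slash-separated sets such as R123L/R127Q/R179Q can be parsed and applied to a parent sequence programmatically. The following is a minimal sketch; the type and function names are hypothetical, and the parent sequence used in the example is a toy placeholder of 189 residues, not SEQ ID NO:8.

```python
import re
from typing import List, NamedTuple


class Substitution(NamedTuple):
    wild_type: str   # one-letter code of the residue in the parent
    position: int    # 1-based position in the parent sequence
    variant: str     # one-letter code of the replacement residue


_CODE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(code: str) -> Substitution:
    """Parse a designation such as 'R14L' into its three parts."""
    match = _CODE.match(code.strip())
    if match is None:
        raise ValueError(f"not a substitution designation: {code!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))


def parse_set(codes: str) -> List[Substitution]:
    """Parse a slash-separated set such as 'R123L/R127Q/R179Q'."""
    return [parse_substitution(code) for code in codes.split("/")]


def apply_substitutions(parent: str, subs: List[Substitution]) -> str:
    """Return the parent sequence with each substitution applied."""
    residues = list(parent)
    for sub in subs:
        index = sub.position - 1
        if not 0 <= index < len(residues):
            raise ValueError(f"position {sub.position} is outside the sequence")
        if residues[index] != sub.wild_type:
            raise ValueError(
                f"expected {sub.wild_type} at {sub.position}, found {residues[index]}"
            )
        residues[index] = sub.variant
    return "".join(residues)


if __name__ == "__main__":
    # Toy parent: glycine everywhere, arginine at the three example positions.
    toy = list("G" * 189)
    for pos in (123, 127, 179):
        toy[pos - 1] = "R"
    variant = apply_substitutions("".join(toy), parse_set("R123L/R127Q/R179Q"))
    print(variant[122], variant[126], variant[178])
```

Checking the wild-type residue before substituting guards against numbering mismatches when a designation is applied to a homologous sequence at an "equivalent" position.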

The present invention also provides cleaning compositions comprising a cleaning 
effective amount of a proteolytic enzyme, the enzyme comprising an amino acid sequence 
having at least 70% sequence identity to SEQ ID NO:4, and a suitable cleaning formulation. 
In some preferred embodiments, the cleaning compositions further comprise one or more 
additional enzymes or enzyme derivatives selected from the group consisting of proteases, 
amylases, lipases, mannanases, pectinases, cutinases, oxidoreductases, hemicellulases, 
and cellulases. 

The present invention also provides compositions comprising at least one serine 
protease obtained from a member of the Micrococcineae, wherein the compositions further 
comprise at least one stabilizer. In some embodiments, the stabilizer is selected from the 
group consisting of borax and glycerol. In some embodiments, the present invention 
provides competitive inhibitors suitable to stabilize the enzyme of the present invention to 
anionic surfactants. In some embodiments, at least one protease is obtained from an 
organism selected from the group consisting of Cellulomonas, Oerskovia, 
Cellulosimicrobium, Xylanibacterium, and Promicromonospora. In some preferred 
embodiments, the protease is obtained from Cellulomonas 69B4. In some particularly 
preferred embodiments, at least one protease comprises the amino acid sequence set forth 
in SEQ ID NO:8. 

The present invention further provides compositions comprising at least one serine 
protease obtained from a member of the Micrococcineae, wherein the serine 
protease is an autolytically stable variant. In some embodiments, at least one variant 
protease is obtained from an organism selected from the group consisting of Cellulomonas, 
Oerskovia, Cellulosimicrobium, Xylanibacterium, and Promicromonospora. In some 
preferred embodiments, the variant protease is obtained from Cellulomonas 69B4. In some 
particularly preferred embodiments, at least one variant protease comprises the amino acid 
sequence set forth in SEQ ID NO:8. 

The present invention also provides cleaning compositions comprising at least 
0.0001 weight percent of the serine protease of the present invention, and optionally, an 
adjunct ingredient. In some embodiments, the composition comprises an adjunct ingredient. 
In some preferred embodiments, the composition comprises a sufficient amount of a pH 
modifier to provide the composition with a neat pH of from about 3 to about 5, the 
composition being essentially free of materials that hydrolyze at a pH of from about 3 to 
about 5. In some particularly preferred embodiments, the materials that hydrolyze comprise 
a surfactant material. In additional embodiments, the cleaning composition is a liquid 
composition. In further embodiments, the surfactant material comprises a sodium alkyl 
sulfate surfactant that comprises an ethylene oxide moiety. 

The present invention additionally provides cleaning compositions that comprise at 
least one acid stable enzyme, the cleaning composition comprising a sufficient amount of a 
pH modifier to provide the composition with a neat pH of from about 3 to about 5, the 
composition being essentially free of materials that hydrolyze at a pH of from about 3 to 
about 5. In further embodiments, the materials that hydrolyze comprise a surfactant 
material. In some preferred embodiments, the cleaning composition is a liquid 
composition. In yet additional embodiments, the surfactant material comprises a sodium 
alkyl sulfate surfactant that comprises an ethylene oxide moiety. In some embodiments, 
the cleaning composition comprises a suitable adjunct ingredient. In some additional 
embodiments, the composition comprises a suitable adjunct ingredient. In some preferred 
embodiments, the composition comprises from about 0.001 to about 0.5 weight % of ASP. 
In some alternatively preferred embodiments, the composition comprises from about 0.01 to 
about 0.1 weight percent of ASP. 
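
As a worked illustration of the weight percent ranges recited above, the mass of ASP corresponding to a given weight percent of a batch can be computed as follows. This is a minimal sketch; the function names and the 1000 g batch size are hypothetical and are chosen only to make the arithmetic concrete.

```python
def enzyme_mass_grams(batch_mass_grams: float, weight_percent: float) -> float:
    """Mass of enzyme (g) corresponding to a weight percent of the batch."""
    if batch_mass_grams < 0 or weight_percent < 0:
        raise ValueError("mass and weight percent must be non-negative")
    return batch_mass_grams * weight_percent / 100.0


def within_range(weight_percent: float, low: float, high: float) -> bool:
    """True if the weight percent lies within [low, high]."""
    return low <= weight_percent <= high


if __name__ == "__main__":
    batch = 1000.0  # hypothetical batch of 1000 g of cleaning composition
    for wt in (0.001, 0.01, 0.1, 0.5):
        grams = enzyme_mass_grams(batch, wt)
        print(f"{wt} wt% of {batch:g} g = {grams:g} g of enzyme")
```

On these assumptions, the range of about 0.001 to about 0.5 weight percent corresponds to roughly 0.01 g to 5 g of ASP per kilogram of composition, and the range of about 0.01 to about 0.1 weight percent to roughly 0.1 g to 1 g per kilogram.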

The present invention also provides methods of cleaning, comprising the steps 
of: a) contacting a surface and/or an article comprising a fabric with the cleaning 
composition comprising the serine protease of the present invention at an appropriate 
concentration; and b) optionally washing and/or rinsing the surface or material. In 
alternative embodiments, any suitable composition provided herein finds use in these 
methods. 

The present invention also provides animal feed comprising at least one serine 
protease obtained from a member of the Micrococcineae. In some embodiments, at least 
one protease is obtained from an organism selected from the group consisting of 
Cellulomonas, Oerskovia, Cellulosimicrobium, Xylanibacterium, and Promicromonospora. In 
some preferred embodiments, the protease is obtained from Cellulomonas 69B4. In some 
particularly preferred embodiments, at least one protease comprises the amino acid 
sequence set forth in SEQ ID NO:8. 

The present invention provides an isolated polypeptide having proteolytic activity, 
(e.g., a protease) having the amino acid sequence set forth in SEQ ID NO:8. In some 
embodiments, the present invention provides isolated polypeptides having approximately 
40% to 98% identity with the sequence set forth in SEQ ID NO:8. In some preferred 
embodiments, the polypeptides have approximately 50% to 95% identity with the sequence 
set forth in SEQ ID NO:8. In some additional preferred embodiments, the polypeptides have 
approximately 60% to 90% identity with the sequence set forth in SEQ ID NO:8. In yet 
additional embodiments, the polypeptides have approximately 65% to 85% identity with the 
sequence set forth in SEQ ID NO:8. In some particularly preferred embodiments, the 
polypeptides have approximately 90% to 95% identity with the sequence set forth in SEQ ID 
NO:8. 
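
For illustration, percent identity between two sequences that have already been aligned can be computed as sketched below. The convention assumed here (identical residues divided by the number of aligned columns, ignoring columns that are gaps in both sequences) is only one of several in common use and is not stated to be the convention of the present Specification; the function names and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumed convention: matches / aligned columns, where columns that are
    gaps in both sequences are ignored and a gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def in_identity_range(identity: float, low: float, high: float) -> bool:
    """True if the identity falls within a recited range, e.g. 40 to 98."""
    return low <= identity <= high


if __name__ == "__main__":
    # Toy aligned fragments; not taken from SEQ ID NO:8.
    reference = "FDVIGGNAYT"
    candidate = "FDVLGG-AYT"
    identity = percent_identity(reference, candidate)
    print(f"{identity:.1f}% identity")
    print(in_identity_range(identity, 40.0, 98.0))
```

Because the result depends on the alignment and on how gaps are counted, a value computed this way should be read together with the alignment method used to obtain it.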

The present invention further provides proteases obtained from bacteria of the 
suborder Micrococcineae. In some preferred embodiments, the proteases are obtained 
from members of the family Promicromonosporaceae. In yet further embodiments, the 
proteases are obtained from any member of the genera Xylanimicrobium, Xylanibacterium, 
Xylanimonas, Myceligenerans, and Promicromonospora. In some preferred embodiments, 
the proteases are obtained from members of the family Cellulomonadaceae. In some 
particularly preferred embodiments, the proteases are obtained from members of the genera 
Cellulomonas and Oerskovia. In some further preferred embodiments, the proteases are 
derived from Cellulomonas spp. In some embodiments, the Cellulomonas spp. is selected 
from Cellulomonas fimi, Cellulomonas biazotea, Cellulomonas cellasea, Cellulomonas 
hominis, Cellulomonas flavigena, Cellulomonas persica, Cellulomonas iranensis, 
Cellulomonas gelida, Cellulomonas humilata, Cellulomonas turbata, Cellulomonas uda, 
Cellulomonas fermentans, Cellulomonas xylanilytica, Cellulomonas humilata and 
Cellulomonas strain 69B4 (DSM 16035). 

In alternative embodiments, the proteases are derived from Oerskovia spp. In some 
preferred embodiments, the Oerskovia spp. is selected from Oerskovia jenensis, Oerskovia 
paurometabola, Oerskovia enterophila, Oerskovia turbata and Oerskovia turbata strain DSM 
20577. 

In some embodiments, the proteases have apparent molecular weights of about 
17 kD to 21 kD as determined by a matrix assisted laser desorption/ionization - time of flight 
("MALDI-TOF") spectrophotometer. 
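
By way of illustration, a theoretical average mass can be estimated from an amino acid sequence and compared against the stated range of about 17 kD to 21 kD. The following is a minimal sketch using standard average residue masses; it ignores post-translational modifications, disulfide bonds, and adducts, so it only approximates an apparent mass measured by MALDI-TOF. The function names and the test sequence are hypothetical.

```python
# Standard average residue masses in daltons (residue = amino acid minus water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water per chain for the free N- and C-termini


def average_mass_da(sequence: str) -> float:
    """Theoretical average mass (Da) of an unmodified linear polypeptide."""
    try:
        residues = sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence.upper())
    except KeyError as err:
        raise ValueError(f"unknown residue: {err.args[0]!r}") from None
    return residues + WATER_MASS


def in_reported_range(mass_da: float, low_kd: float = 17.0, high_kd: float = 21.0) -> bool:
    """True if the mass lies within the stated range, given in kilodaltons."""
    return low_kd * 1000.0 <= mass_da <= high_kd * 1000.0


if __name__ == "__main__":
    toy = "A" * 250  # hypothetical placeholder sequence, not SEQ ID NO:8
    mass = average_mass_da(toy)
    print(f"{mass / 1000.0:.2f} kD", in_reported_range(mass))
```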

The present invention further provides isolated polynucleotides that encode 
proteases comprising an amino acid sequence comprising at least 40% amino acid sequence 
identity to SEQ ID NO:8. In some embodiments, the proteases have at least 50% amino 
acid sequence identity to SEQ ID NO:8. In some embodiments, the proteases have at least 
60% amino acid sequence identity to SEQ ID NO:8. In some embodiments, the proteases 
have at least 70% amino acid sequence identity to SEQ ID NO:8. In some embodiments, 
the proteases have at least 80% amino acid sequence identity to SEQ ID NO:8. In some 
embodiments, the proteases have at least 90% amino acid sequence identity to SEQ ID 
NO:8. In some embodiments, the proteases have at least 95% amino acid sequence 
identity to SEQ ID NO:8. The present invention also provides expression vectors comprising 
any of the polynucleotides provided above. 

The present invention further provides host cells transformed with the expression 
vectors of the present invention, such that at least one protease is expressed by the host 
cells. In some embodiments, the host cells are bacteria, while in other embodiments, the 
host cells are fungi. In some preferred embodiments, the bacterial host cells are selected 
from the group consisting of the genera Bacillus and Streptomyces. In some alternative 
preferred embodiments, the fungal host cells are members of the genus Trichoderma, while 
in other alternative preferred embodiments, the fungal host cells are members of the genus 
Aspergillus. 

The present invention also provides isolated polynucleotides comprising a nucleotide 
sequence (i) having at least 70% identity to SEQ ID NOS:3 or 4, or (ii) being capable of 
hybridizing to a probe derived from the nucleotide sequence disclosed in SEQ ID NOS:3 or 
4, under conditions of medium to high stringency, or (iii) being complementary to the 
nucleotide sequence disclosed in SEQ ID NOS:3 or 4. In some embodiments, the present 
invention provides vectors comprising such polynucleotide. In further embodiments, the 
present invention provides host cells transformed with such vector. 

The present invention further provides methods for producing at least one enzyme 
having protease activity, comprising: the steps of transforming a host cell with an expression 
vector comprising a polynucleotide comprising at least 70% sequence identity to SEQ ID 
NO:4, cultivating the transformed host cell under conditions suitable for the host cell to 
produce the protease; and recovering the protease. In some preferred embodiments, the 
host cell is a Streptomyces spp., while in other embodiments, the host cell is a Bacillus spp., 
a Trichoderma spp., and/or an Aspergillus spp. In some embodiments, the Streptomyces 
spp. is Streptomyces lividans. In alternative embodiments, the host cell is T. reesei. In 
further embodiments, the Aspergillus spp. is A. niger. 

The present invention also provides fragments (i.e., portions) of the DNA encoding 
the proteases provided herein. These fragments find use in obtaining partial length DNA 
fragments capable of being used to isolate or identify polynucleotides encoding mature 
protease enzyme described herein from Cellulomonas 69B4, or a segment thereof having 
proteolytic activity. In some embodiments, portions of the DNA provided in SEQ ID NO:1 
find use in obtaining homologous fragments of DNA from other species, and particularly 
from Micrococcineae spp. which encode a protease or portion thereof having proteolytic 
activity. 

The present invention further provides at least one probe comprising a 
polynucleotide substantially identical to a fragment of SEQ ID NOS:1, 2, 3 or 4, wherein the 
probe is used to detect a nucleic acid sequence coding for an enzyme having proteolytic 
activity, and wherein the nucleic acid sequence is obtained from a bacterial source. In some 
embodiments, the bacterial source is a Cellulomonas spp. In some preferred embodiments, 
the bacterial source is Cellulomonas strain 69B4. 

The present invention further provides compositions comprising at least one of the 
proteases provided herein. In some preferred embodiments, the compositions are cleaning 
compositions. In some embodiments, the present invention provides cleaning compositions 
comprising a cleaning effective amount of at least one protease comprising an amino acid 
sequence having at least 40% sequence identity to SEQ ID NO:8, at least 90% sequence 
identity to SEQ ID NO:8, and/or having an amino acid sequence of SEQ ID NO:8. In some 
embodiments, the cleaning compositions further comprise at least one suitable cleaning 
adjunct. In some embodiments, the protease is derived from a Cellulomonas sp. In some 
preferred embodiments, the Cellulomonas spp. is selected from Cellulomonas fimi, 
Cellulomonas biazotea, Cellulomonas cellasea, Cellulomonas hominis, Cellulomonas 
flavigena, Cellulomonas persica, Cellulomonas iranensis, Cellulomonas gelida, 
Cellulomonas humilata, Cellulomonas turbata, Cellulomonas uda, and Cellulomonas strain 
69B4 (DSM 16035). In some particularly preferred embodiments, the Cellulomonas spp. is 
Cellulomonas strain 69B4. In still further embodiments, the cleaning composition further 
comprises at least one additional enzyme or enzyme derivative selected from the group 
consisting of protease, amylase, lipase, mannanase and cellulase. 

The present invention also provides isolated naturally occurring proteases 
comprising an amino acid sequence having at least 45% sequence identity to SEQ ID NO:8, 
at least 60% sequence identity to SEQ ID NO:8, at least 75% sequence identity to SEQ ID 
NO:8, at least 90% sequence identity to SEQ ID NO:8, at least 95% sequence identity to 
SEQ ID NO:8, and/or having the sequence identity of SEQ ID NO:8, the protease being 
isolated from a Cellulomonas spp. In some embodiments, the protease is isolated from 
Cellulomonas strain 69B4 (DSM 16035). 

In additional embodiments, the present invention provides engineered variants of the 
serine proteases of the present invention. In some embodiments, the engineered variants 
are genetically modified using recombinant DNA technologies, while in other embodiments, 
the variants are naturally occurring. The present invention further encompasses engineered 
variants of homologous enzymes. In some embodiments, the engineered variant 
homologous proteases are genetically modified using recombinant DNA technologies, while 
in other embodiments, the variant homologous proteases are naturally occurring. 

The present invention also provides serine proteases that immunologically cross- 
react with the Cellulomonas 69B4 protease (i.e., ASP) of the present invention. Indeed, it is 
intended that the present invention encompass fragments (e.g., epitopes) of the ASP 
protease that stimulate an immune response in animals (including, but not limited to 
humans) and/or are recognized by antibodies of any class. The present invention further 
encompasses epitopes on proteases that are cross-reactive with ASP epitopes. In some 
embodiments, the ASP epitopes are recognized by antibodies, but do not stimulate an 
immune response in animals (including, but not limited to humans), while in other 
embodiments, the ASP epitopes stimulate an immune response in at least one animal 
species (including, but not limited to humans) and are recognized by antibodies of any class. 
The present invention also provides means and compositions for identifying and assessing 
cross-reactive epitopes. 

The present invention further provides at least one polynucleotide encoding a signal 
peptide (i) having at least 70% sequence identity to SEQ ID NO:9, or (ii) being capable of 
hybridizing to a probe derived from the polypeptide sequence encoding SEQ ID NO:9, under 
conditions of medium to high stringency, or (iii) being complementary to the polypeptide 
sequence provided in SEQ ID NO:9. In further embodiments, the present invention provides 
vectors comprising the polynucleotide described above. In yet additional embodiments, a 
host cell is provided that is transformed with the vector. 

The present invention also provides methods for producing proteases, comprising: 
(a) transforming a host cell with an expression vector comprising a polynucleotide having at 
least 70% sequence identity to SEQ ID NO:4, at least 95% sequence identity to SEQ ID 
NO:4, and/or having a polynucleotide sequence of SEQ ID NO:4; (b) cultivating the 
transformed host cell under conditions suitable for the host cell to produce the protease; and 
(c) recovering the protease. In some embodiments, the host cell is a Bacillus species 
(e.g., B. subtilis, B. clausii, or B. licheniformis). In alternative embodiments, the host cell is a 
Streptomyces spp. (e.g., Streptomyces lividans). In additional embodiments, the host cell 
is a Trichoderma spp. (e.g., Trichoderma reesei). In yet further embodiments, the host cell 
is an Aspergillus spp. (e.g., Aspergillus niger). 

As will be appreciated, an advantage of the present invention is that a polynucleotide 
has been isolated which provides the capability of isolating further polynucleotides which 
encode proteins having serine protease activity, wherein the backbone is substantially 
identical to that of the Cellulomonas protease of the present invention. 

In further embodiments, the present invention provides means to produce host cells 
that are capable of producing the serine proteases of the present invention in relatively large 
quantities. In particularly preferred embodiments, the present invention provides means to 
produce protease with various commercial applications where degradation or synthesis of 
polypeptides is desired, including cleaning compositions, as well as feed components, 
textile processing, leather finishing, grain processing, meat processing, cleaning, 
preparation of protein hydrolysates, digestive aids, microbicidal compositions, bacteriostatic 
compositions, fungistatic compositions, personal care products, including oral care, hair care, 
and/or skin care. 

The present invention further provides enzyme compositions that have comparable or 
improved wash performance, as compared to presently used subtilisin proteases. Other 
objects and advantages of the present invention are apparent from the present 
Specification. 

The present invention provides an isolated polypeptide having proteolytic activity, 
(e.g., a protease) having the amino acid sequence set forth in SEQ ID NO:8. In some 
embodiments, the present invention provides isolated polypeptides having approximately 
40% to 98% identity with the sequence set forth in SEQ ID NO:8. In some preferred 
embodiments, the polypeptides have approximately 50% to 95% identity with the sequence 
set forth in SEQ ID NO:8. In some additional preferred embodiments, the polypeptides have 
approximately 60% to 90% identity with the sequence set forth in SEQ ID NO:8. In yet 
additional embodiments, the polypeptides have approximately 65% to 85% identity with the 
sequence set forth in SEQ ID NO:8. In some particularly preferred embodiments, the 
polypeptides have approximately 90% to 95% identity with the sequence set forth in SEQ ID 
NO:8. 

The present invention further provides proteases obtained from bacteria of the 
suborder Micrococcineae. In some preferred embodiments, the proteases are obtained 
from members of the family Promicromonosporaceae. In yet further embodiments, the 
proteases are obtained from any member of the genera Xylanimicrobium, Xylanibacterium, 
Xylanimonas, Myceligenerans, and Promicromonospora. In some preferred embodiments, 
the proteases are obtained from members of the family Cellulomonadaceae. In some 
particularly preferred embodiments, the proteases are obtained from members of the genera 
Cellulomonas and Oerskovia. In some further preferred embodiments, the proteases are 
derived from Cellulomonas spp. In some embodiments, the Cellulomonas spp. is selected 
from Cellulomonas fimi, Cellulomonas biazotea, Cellulomonas cellasea, Cellulomonas 
hominis, Cellulomonas flavigena, Cellulomonas persica, Cellulomonas iranensis, 
Cellulomonas gelida, Cellulomonas humilata, Cellulomonas turbata, Cellulomonas uda, 
Cellulomonas fermentans, Cellulomonas xylanilytica, Cellulomonas humilata and 
Cellulomonas strain 69B4 (DSM 16035). 

In alternative embodiments, the proteases are derived from Oerskovia spp. In some 
preferred embodiments, the Oerskovia spp. is selected from Oerskovia jenensis, Oerskovia 
paurometabola, Oerskovia enterophila, Oerskovia turbata and Oerskovia turbata strain DSM 
20577. 

In some embodiments, the proteases have apparent molecular weights of about 
17 kD to 21 kD as determined by a matrix assisted laser desorption/ionization - time of flight 
("MALDI-TOF") spectrophotometer. 

The present invention further provides isolated polynucleotides that encode 
proteases comprising an amino acid sequence comprising at least 40% amino acid sequence 
identity to SEQ ID NO:8. In some embodiments, the proteases have at least 50% amino 
acid sequence identity to SEQ ID NO:8. In some embodiments, the proteases have at least 
60% amino acid sequence identity to SEQ ID NO:8. In some embodiments, the proteases 
have at least 70% amino acid sequence identity to SEQ ID NO:8. In some embodiments, 
the proteases have at least 80% amino acid sequence identity to SEQ ID NO:8. In some 
embodiments, the proteases have at least 90% amino acid sequence identity to SEQ ID 
NO:8. In some embodiments, the proteases have at least 95% amino acid sequence 
identity to SEQ ID NO:8. The present invention also provides expression vectors comprising 
any of the polynucleotides provided above. 

The present invention further provides host cells transformed with the expression 
vectors of the present invention, such that at least one protease is expressed by the host 
cells. In some embodiments, the host cells are bacteria, while in other embodiments, the 
host cells are fungi. In some preferred embodiments, the bacterial host cells are selected 
from the group consisting of the genera Bacillus and Streptomyces. In some alternative 
preferred embodiments, the fungal host cells are members of the genus Trichoderma, while 
in other alternative preferred embodiments, the fungal host cells are members of the genus 
Aspergillus. 

The present invention also provides isolated polynucleotides comprising a nucleotide 
sequence (i) having at least 70% identity to SEQ ID NOS:3 or 4, or (ii) being capable of 
hybridizing to a probe derived from the nucleotide sequence disclosed in SEQ ID NOS: 3 or 
4, under conditions of medium to high stringency, or (iii) being complementary to the 
nucleotide sequence disclosed in SEQ ID NOS:3 or 4. In some embodiments, the present 
invention provides vectors comprising such polynucleotide. In further embodiments, the 
present invention provides host cells transformed with such vector. 

The present invention further provides methods for producing at least one enzyme 
having protease activity, comprising: the steps of transforming a host cell with an expression 
vector comprising a polynucleotide comprising at least 70% sequence identity to SEQ ID 
NO:4, cultivating the transformed host cell under conditions suitable for the host cell to 
produce the protease; and recovering the protease. In some preferred embodiments, the 
host cell is a Streptomyces spp., while in other embodiments, the host cell is a Bacillus spp., 
a Trichoderma spp., and/or an Aspergillus spp. In some embodiments, the Streptomyces 
spp. is Streptomyces lividans. In alternative embodiments, the host cell is T. reesei. In 
further embodiments, the Aspergillus spp. is A. niger. 

The present invention also provides fragments (i.e., portions) of the DNA encoding 
the proteases provided herein. These fragments find use in obtaining partial length DNA 
fragments capable of being used to isolate or identify polynucleotides encoding mature 
protease enzyme described herein from Cellulomonas 69B4, or a segment thereof having 
proteolytic activity. In some embodiments, portions of the DNA provided in SEQ ID NO:1 
find use in obtaining homologous fragments of DNA from other species, and particularly 
from Micrococcineae spp. which encode a protease or portion thereof having proteolytic 
activity. 

The present invention further provides at least one probe comprising a 
polynucleotide substantially identical to a fragment of SEQ ID NOS:1, 2, 3 or 4, wherein the 
probe is used to detect a nucleic acid sequence coding for an enzyme having proteolytic 
activity, and wherein the nucleic acid sequence is obtained from a bacterial source. In some 
embodiments, the bacterial source is a Cellulomonas spp. In some preferred embodiments, 
the bacterial source is Cellulomonas strain 69B4. 

The present invention further provides compositions comprising at least one of the 
proteases provided herein. In some preferred embodiments, the compositions are cleaning 
compositions. In some embodiments, the present invention provides cleaning compositions 
comprising a cleaning effective amount of at least one protease comprising an amino acid 
sequence having at least 40% sequence identity to SEQ ID NO:8, at least 90% sequence 
identity to SEQ ID NO:8, and/or having an amino acid sequence of SEQ ID NO:8. In some 
embodiments, the cleaning compositions further comprise at least one suitable cleaning 
adjunct. In some embodiments, the protease is derived from a Cellulomonas sp. In some 
preferred embodiments, the Cellulomonas spp. is selected from Cellulomonas fimi, 
Cellulomonas biazotea, Cellulomonas cellasea, Cellulomonas hominis, Cellulomonas 
flavigena, Cellulomonas persica, Cellulomonas iranensis, Cellulomonas gelida, 
Cellulomonas humilata, Cellulomonas turbata, Cellulomonas uda, and Cellulomonas strain 
69B4 (DSM 16035). In some particularly preferred embodiments, the Cellulomonas spp. is 
Cellulomonas strain 69B4. In still further embodiments, the cleaning composition further 
comprises at least one additional enzyme or enzyme derivative selected from the group 
consisting of protease, amylase, lipase, mannanase and cellulase. 

The present invention also provides isolated naturally occurring proteases 
comprising an amino acid sequence having at least 45% sequence identity to SEQ ID NO:8, 
at least 60% sequence identity to SEQ ID NO:8, at least 75% sequence identity to SEQ ID 
NO:8, at least 90% sequence identity to SEQ ID NO:8, at least 95% sequence identity to 
SEQ ID NO:8, and/or having the sequence identity of SEQ ID NO:8, the protease being 
isolated from a Cellulomonas spp. In some embodiments, the protease is isolated from 
Cellulomonas strain 69B4 (DSM 16035). 
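
By way of illustration only, the percent sequence identity thresholds recited above (e.g., at least 45%, 60%, 75%, 90%, or 95% identity to SEQ ID NO:8) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the comparison method of the present Specification: it assumes the two sequences have already been aligned to equal length with "-" as the gap character, and it assumes that every aligned column other than a gap-gap column counts toward the denominator. The function names and the toy peptides are hypothetical and are not SEQ ID NO:8.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over two pre-aligned, equal-length sequences.

    Assumed convention: columns in which both sequences carry a gap are
    skipped; every other column counts toward the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # gap-gap column carries no information
        compared += 1
        if a == b:
            matches += 1  # identical residues (neither can be a gap here)
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold_percent: float) -> bool:
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


if __name__ == "__main__":
    # Hypothetical toy peptides, for illustration only.
    reference = "FDVIGGNAYT"
    candidate = "FDVIGGDA-T"
    print(percent_identity(reference, candidate))
    print(meets_threshold(reference, candidate, 75.0))
```

In practice the alignment step itself, and the choice of denominator, materially affect the reported percentage, so any such figure should be read together with the alignment parameters used.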

In additional embodiments, the present invention provides engineered variants of the 
serine proteases of the present invention. In some embodiments, the engineered variants 
are genetically modified using recombinant DNA technologies, while in other embodiments, 
the variants are naturally occurring. The present invention further encompasses engineered 
variants of homologous enzymes. In some embodiments, the engineered variant 
homologous proteases are genetically modified using recombinant DNA technologies, while 
in other embodiments, the variant homologous proteases are naturally occurring. 

The present invention also provides serine proteases that immunologically cross- 
react with the ASP protease of the present invention. Indeed, it is intended that the present 
invention encompass fragments (e.g., epitopes) of the ASP protease that stimulate an 
immune response in animals (including, but not limited to humans) and/or are recognized by 
antibodies of any class. The present invention further encompasses epitopes on proteases 
that are cross-reactive with ASP epitopes. In some embodiments, the ASP epitopes are 
recognized by antibodies, but do not stimulate an immune response in animals (including, 
but not limited to humans), while in other embodiments, the ASP epitopes stimulate an 
immune response in at least one animal species (including, but not limited to humans) and 
are recognized by antibodies of any class. The present invention also provides means and 
compositions for identifying and assessing cross-reactive epitopes. 

The present invention further provides at least one polynucleotide encoding a signal 
peptide (i) having at least 70% sequence identity to SEQ ID NO:9, or (ii) being capable of 
hybridizing to a probe derived from the polypeptide sequence encoding SEQ ID NO:9, under 
conditions of medium to high stringency, or (iii) being complementary to the polypeptide 
sequence provided in SEQ ID NO:9. In further embodiments, the present invention provides 
vectors comprising the polynucleotide described above. In yet additional embodiments, a 
host cell is provided that is transformed with the vector. 

The present invention also provides methods for producing proteases, comprising: 
(a) transforming a host cell with an expression vector comprising a polynucleotide having at 
least 70% sequence identity to SEQ ID NO:4, at least 95% sequence identity to SEQ ID 
NO:4, and/or having a polynucleotide sequence of SEQ ID NO:4; (b) cultivating the 
transformed host cell under conditions suitable for the host cell to produce the protease; and 
(c) recovering the protease. In some embodiments, the host cell is a Bacillus species 
(e.g., B. subtilis, B. clausii, or B. licheniformis). In alternative embodiments, the host cell is a 
Streptomyces spp. (e.g., Streptomyces lividans). In additional embodiments, the host cell 
is a Trichoderma spp. (e.g., Trichoderma reesei). In yet further embodiments, the host cell 
is an Aspergillus spp. (e.g., Aspergillus niger). 

As will be appreciated, an advantage of the present invention is that a polynucleotide 
has been isolated which provides the capability of isolating further polynucleotides which 
encode proteins having serine protease activity, wherein the backbone is substantially 
identical to that of the Cellulomonas protease of the invention. 

In further embodiments, the present invention provides means to produce host cells 
that are capable of producing the serine proteases of the present invention in relatively large 
quantities. In particularly preferred embodiments, the present invention provides means to 
produce protease with various commercial applications where degradation or synthesis of 
polypeptides is desired, including cleaning compositions, as well as feed components, 
textile processing, leather finishing, grain processing, meat processing, cleaning, 
preparation of protein hydrolysates, digestive aids, microbicidal compositions, bacteriostatic 
compositions, fungistatic compositions, and personal care products, including oral care, hair care, 
and/or skin care. 

The present invention further provides enzyme compositions that have comparable or 
improved wash performance, as compared to presently used subtilisin proteases. Other 
objects and advantages of the present invention are apparent from the present 
Specification. 

DESCRIPTION OF THE FIGURES 

Figure 1 provides an unrooted phylogenetic tree illustrating the relationship of novel 
strain 69B4 to members of the family Cellulomonadaceae and other related genera of the 
suborder Micrococcineae. 

Figure 2 provides a phylogenetic tree for ASP protease. 

Figure 3 provides a MALDI TOF spectrum of a protease derived from Cellulomonas 
strain 69B4. 

Figure 4 shows the sequence of the N-terminal most tryptic peptide from C. flavigena. 

Figure 5 provides the plasmid map of the pSEGCT vector. 

Figure 6 provides the plasmid map of the pSEGCT69B4 vector. 

Figure 7 provides the plasmid map of the pSEA469BCT vector. 

Figure 8 provides the plasmid map of the pHPLT-Asp-C1-1 vector. 

Figure 9 provides the plasmid map of the pHPLT-Asp-C1-2 vector. 

Figure 10 provides the plasmid map of the pHPLT-Asp-C2-1 vector. 

Figure 11 provides the plasmid map of the pHPLT-Asp-C2-2 vector. 

Figure 12 provides the plasmid map of the pHPLT-ASP-III vector. 

Figure 13 provides the plasmid map of the pHPLT-ASP-IV vector. 

Figure 14 provides the plasmid map of the pHPLT-ASP-VII vector. 

Figure 15 provides the plasmid map of the pXX-KpnI vector. 

Figure 16 provides the plasmid map of the p2JM103-DNNP1 vector. 

Figure 17 provides the plasmid map of the pHPLT vector. 

Figure 18 provides the map and MXL-prom sequences for the opened pHPLT-ASP-C1-2. 

Figure 19 provides the plasmid map of the pENMx3 vector. 

Figure 20 provides the plasmid map of the pICatH vector. 

Figure 21 provides the plasmid map of the pTREX4 vector. 

Figure 22 provides the plasmid map of the pSLGAMpR2 vector. 

Figure 23 provides the plasmid map of the pRAXdes2-ASP vector. 

Figure 24 provides the plasmid map of the pAPDI vector. 

Figure 25 provides graphs showing ASP autolysis. Panel A provides a graph 
showing the ASP autolysis peptides observed in a buffer without LAS. Panel B provides a 
graph showing the ASP autolysis peptides observed in a buffer with 0.1% LAS. 

Figure 26 compares the cleaning activity (absorbance at 405 nm) dose (ppm) 
response curves of certain serine proteases (69B4 [-x-]; PURAFECT® [-♦-]; RELASE™ 
[-A-]; and OPTIMASE™ [-■-]) in liquid TIDE® detergent under North American wash 
conditions. 

Figure 27 provides a graph that compares the cleaning activity (absorbance at 405 
nm) dose (ppm) response curves of certain serine proteases (69B4 [-x-]; PURAFECT® 
[-♦-]; RELASE™ [-A-]; and OPTIMASE™ [-■-]) in Detergent Composition III powder detergent 
(0.66 g/l) North American concentration/detergent formulation under Japanese wash 
conditions. 

Figure 28 provides a graph that compares the cleaning activity (absorbance at 405 
nm) dose (ppm) response curves of certain serine proteases (69B4 [-x-]; PURAFECT® 
[-♦-]; RELASE™ [-A-]; and OPTIMASE™ [-■-]) in ARIEL® REGULAR detergent powder under 
European wash conditions. 

Figure 29 provides a graph that compares the cleaning activity (absorbance at 405 
nm) dose (ppm) response curves of certain serine proteases (69B4 [-x-]; PURAFECT® [-♦-]; 
RELASE™ [-A-]; and OPTIMASE™ [-■-]) in PURE CLEAN detergent powder under 
Japanese conditions. 

Figure 30 provides a graph that compares the cleaning activity (absorbance at 405 
nm) dose (ppm) response curves of certain serine proteases (69B4 [-x-]; PURAFECT® 
[-♦-]; RELASE™ [-A-]; and OPTIMASE™ [-■-]) in Detergent Composition III powder (1.00 g/l) 
under North American conditions. 

Figure 31 provides a graph that shows comparative oxidative inactivation of various 
serine proteases (100 ppm) as a measure of percent enzyme activity over time (minutes) 
(69B4 [-x-]; BPN'-variant 1 [-♦-]; PURAFECT® [-A-]; and GG36-variant 1 [-■-]) with 0.1 M 
H2O2 at pH 9.45, 25°C. 

Figure 32 provides a graph that shows comparative chelator inactivation of various 
serine proteases (100 ppm) as a measure of percent enzyme activity over time (minutes) 
(69B4 [-x-]; BPN'-variant 1 [-♦-]; PURAFECT® [-A-]; and GG36-variant 1 [-■-]) with 10 mM 
EDTA at pH 8.20, 45°C. 

Figure 33 provides a graph that shows comparative thermal inactivation of various 
serine proteases (100 ppm) as a measure of percent enzyme activity over time (minutes) 
(69B4 [-x-]; BPN'-variant [-♦-]; PURAFECT® [-A-]; and GG36-variant 1 [-■-]) with 50 mM 
Tris at pH 8.0, 45°C. 

Figure 34 provides a graph that shows comparative thermal inactivation of certain 
serine proteases (69B4 [-x-]; BPN'-variant [-♦-]; PURAFECT® [-A-]; and GG36-variant 1 
[-■-]) at pH 8.60, over a temperature gradient of 57°C to 62°C. 

Figure 35 provides a graph that shows enzyme activity (hydrolysis of di-methyl 
casein measured by absorbance at 405 nm) of certain serine proteases (2.5 ppm) (69B4 
[-■-]; BPN'-variant [-♦-]; PURAFECT® [-A-]; and GG36-variant 1 [ -]) at pHs ranging from 
5 to 12 at 37°C. 

Figure 36 provides a bar graph that shows enzyme stability as indicated by % 
remaining activity (hydrolysis of di-methyl casein measured by absorbance at 405 nm) of 
certain serine proteases (2.5 ppm) (69B4; BPN'-variant; PURAFECT®; and GG36-variant 1) 
at pHs of 3, 4, 5, and 6 at 25°, 35°, and 45°C, respectively. 

Figure 37 provides a graph that shows enzyme stability as indicated by % remaining 
activity of a BPN'-variant at pH ranges from 3 (-♦-), 4 (-■-), 5 (-A-) to 6 (-X-) at 25°, 
35°, and 45°C, respectively. 

Figure 38 provides a graph that shows enzyme stability as indicated by % remaining 
activity of PURAFECT® protease at pH ranges from 3 (-♦-), 4 (-■-), 5 (-A-) to 6 (-X-) 
at 25°, 35°, and 45°C, respectively. 

Figure 39 provides a graph that shows enzyme stability as indicated by % remaining 
activity of 69B4 protease at pH ranges from 3 (-♦-), 4 (-■-), 5 (-A-) to 6 (-X-) at 25°, 
35°, and 45°C, respectively. 

DESCRIPTION OF THE INVENTION 

The present invention provides novel serine proteases, novel genetic material 
encoding these enzymes, and proteolytic proteins obtained from Micrococcineae spp., 
including but not limited to Cellulomonas spp. and variant proteins developed therefrom. In 
particular, the present invention provides protease compositions obtained from a 
Cellulomonas spp., DNA encoding the protease, vectors comprising the DNA encoding the 
protease, host cells transformed with the vector DNA, and an enzyme produced by the host 
cells. The present invention also provides cleaning compositions (e.g., detergent 
compositions), animal feed compositions, and textile and leather processing compositions 
comprising protease(s) obtained from a Micrococcineae spp., including but not limited to 
Cellulomonas spp. In alternative embodiments, the present invention provides mutant (i.e., 
variant) proteases derived from the wild-type proteases described herein. These mutant 
proteases also find use in numerous applications. 

Gram-positive alkalophilic bacteria have been isolated from in and around alkaline 
soda lakes (See e.g., U.S. Pat. No. 5,401,657, herein incorporated by reference). These 
alkalophiles were analyzed according to the principles of numerical taxonomy with respect to 
each other and also a collection of known bacteria, and taxonomically characterized. Six 
natural clusters or phenons of alkalophilic bacteria were generated. Amongst the strains 
isolated was a strain identified as 69B4. 

Cellulomonas spp. are Gram-positive bacteria classified as members of the family 
Cellulomonadaceae, Suborder Micrococcineae, Order Actinomycetales, Class 
Actinobacteria. Cellulomonas grows as slender, often irregular rods that may occasionally 
show branching, but no mycelium is formed. In addition, there is no aerial growth and no 
spores are formed. Cellulomonas and Streptomyces are only distantly related at a genetic 
level. The large genetic (genomic) distinction between Cellulomonas and Streptomyces is 
reflected in a great difference in phenotypic properties. While serine proteases in 
Streptomyces have been previously examined, there apparently have been no reports of 
any serine proteases (approx. MW 18,000 to 20,000) secreted by Cellulomonas spp. In 
addition, there apparently have been no previous reports of Cellulomonas proteases being 
used in the cleaning and/or feed industry. 

Streptomyces are Gram-positive bacteria classified as members of the Family 
Streptomycetaceae, Suborder Streptomycineae, Order Actinomycetales, class 
Actinobacteria. Streptomyces grows as an extensively branching primary or substrate 
mycelium and an abundant aerial mycelium that at maturity bear characteristic spores. 
Streptogrisins are serine proteases secreted in large amounts from a wide variety of 
Streptomyces species. The amino acid sequences of Streptomyces proteases have been 
determined from at least 9 different species of Streptomyces including Streptomyces griseus 
Streptogrisin C (accession no. P52320); alkaline proteinase (EC 3.4.21.-) from 
Streptomyces sp. (accession no. PC2053); alkaline serine proteinase I from Streptomyces 
sp. (accession no. S34672); serine protease from Streptomyces lividans (accession no. 
CAD4208); putative serine protease from Streptomyces coelicolor A3(2) (accession no. 
NP_625129); putative serine protease from Streptomyces avermitilis MA-4680 (accession 
no. NP_822175); serine protease from Streptomyces lividans (accession no. CAD42809); 
putative serine protease precursor from Streptomyces coelicolor A3(2) (accession no. 
NP_628830). A purified native alkaline protease having an apparent molecular weight of 
19,000 daltons, isolated from Streptomyces griseus var. alcalophilus, and 
cleaning compositions comprising it have been described (See e.g., U.S. Patent No. 
5,646,028, incorporated herein by reference). 

The present invention provides protease enzymes produced by these organisms. 
Importantly, these enzymes have good stability and proteolytic activity. These enzymes find 
use in various applications, including but not limited to cleaning compositions, animal feed, 
textile processing, etc. The present invention also provides means to produce these 
enzymes. In some preferred embodiments, the proteases of the present invention are in 
pure or relatively pure form. 

The present invention also provides nucleotide sequences which are suitable to 
produce the proteases of the present invention in recombinant organisms. In some 
embodiments, recombinant production provides means to produce the proteases in 
quantities that are commercially viable. 

Unless otherwise indicated, the practice of the present invention involves 
conventional techniques commonly used in molecular biology, microbiology, and 
recombinant DNA, which are within the skill of the art. Such techniques are known to those 
of skill in the art and are described in numerous texts and reference works (See e.g., 
Sambrook et al., "Molecular Cloning: A Laboratory Manual", Second Edition (Cold Spring 
Harbor), [1989]; and Ausubel et al., "Current Protocols in Molecular Biology" [1987]). All 
patents, patent applications, articles and publications mentioned herein, both supra and 
infra, are hereby expressly incorporated herein by reference. 

Unless defined otherwise herein, all technical and scientific terms used herein have 
the same meaning as commonly understood by one of ordinary skill in the art to which this 
invention pertains. For example, Singleton and Sainsbury, Dictionary of Microbiology and 
Molecular Biology, 2d Ed., John Wiley and Sons, NY (1994); and Hale and Marham, The 
Harper Collins Dictionary of Biology, Harper Perennial, NY (1991) provide those of skill in 
the art with general dictionaries of many of the terms used in the invention. Although any 
methods and materials similar or equivalent to those described herein find use in the 
practice of the present invention, the preferred methods and materials are described herein. 
Accordingly, the terms defined immediately below are more fully described by reference to 
the Specification as a whole. Also, as used herein, the singular "a", "an" and "the" include 
the plural reference unless the context clearly indicates otherwise. Numeric ranges are 
inclusive of the numbers defining the range. Unless otherwise indicated, nucleic acids are 
written left to right in 5' to 3' orientation; amino acid sequences are written left to right in 
amino to carboxy orientation, respectively. It is to be understood that this invention is not 
limited to the particular methodology, protocols, and reagents described, as these may vary, 
depending upon the context in which they are used by those of skill in the art. 

The practice of the present invention employs, unless otherwise indicated, 
conventional techniques of protein purification, molecular biology, microbiology, recombinant 
DNA techniques and protein sequencing, all of which are within the skill of those in the art. 
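
As an illustration of the convention stated above, under which nucleic acids are written left to right in 5' to 3' orientation, the following minimal Python sketch returns the complementary strand of a DNA sequence, itself written 5' to 3'. The function name and the example sequence are hypothetical and serve only to illustrate the convention; the sketch assumes unambiguous DNA bases (A, C, G, T) and does not handle RNA or ambiguity codes.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq_5_to_3: str) -> str:
    """Return the complementary strand, also written 5' to 3'.

    The complement of a strand read 5'->3' pairs antiparallel to it, so the
    complemented bases are reversed to restore 5'->3' reading order.
    """
    invalid = set(seq_5_to_3) - set("ACGTacgt")
    if invalid:
        raise ValueError(f"unsupported characters: {sorted(invalid)}")
    return seq_5_to_3.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Hypothetical example sequence, for illustration only.
    print(reverse_complement("ATGGCC"))
```

Applying the function twice returns the original sequence, which is a convenient sanity check on any strand-handling code.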

Furthermore, the headings provided herein are not limitations of the various aspects 
or embodiments of the invention which can be had by reference to the specification as a 
whole. Accordingly, the terms defined immediately below are more fully defined by 
reference to the specification as a whole. Nonetheless, in order to facilitate understanding 
of the invention, a number of terms are defined below. 

I. Definitions 

As used herein, the terms "protease," and "proteolytic activity" refer to a protein or 
peptide exhibiting the ability to hydrolyze peptides or substrates having peptide linkages. 
Many well known procedures exist for measuring proteolytic activity (Kalisz, "Microbial 
Proteinases," In: Fiechter (ed.), Advances in Biochemical Engineering/Biotechnology, 
[1988]). For example, proteolytic activity may be ascertained by comparative assays which 
analyze the respective protease's ability to hydrolyze a commercial substrate. Exemplary 
substrates useful in such analysis of protease or proteolytic activity include, but are not 
limited to di-methyl casein (Sigma C-9801), bovine collagen (Sigma C-9879), bovine elastin 
(Sigma E-1625), and bovine keratin (ICN Biomedical 902111). Colorimetric assays utilizing 
these substrates are well known in the art (See e.g., WO 99/34011; and U.S. Pat. No. 
6,376,450, both of which are incorporated herein by reference). The pNA assay (See e.g., 
Del Mar et al., Anal. Biochem., 99:316-320 [1979]) also finds use in determining the active 
enzyme concentration for fractions collected during gradient elution. This assay measures 
the rate at which p-nitroaniline is released as the enzyme hydrolyzes the soluble synthetic 
substrate, succinyl-alanine-alanine-proline-phenylalanine-p-nitroanilide (sAAPF-pNA). The 
rate of production of yellow color from the hydrolysis reaction is measured at 410 nm on a 
spectrophotometer and is proportional to the active enzyme concentration. In addition, 
absorbance measurements at 280 nm can be used to determine the total protein 
concentration. The active enzyme/total-protein ratio gives the enzyme purity. 
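
The arithmetic of the pNA assay described above can be sketched, by way of illustration only, in Python. The sketch fits a least-squares slope to absorbance readings at 410 nm over time, converts that rate to an active enzyme concentration, converts an absorbance at 280 nm to a total protein concentration, and takes the ratio as the purity. The calibration factor relating rate to active enzyme, the 280 nm absorbance coefficient, all function and parameter names, and all numbers are assumptions introduced for illustration; none is taken from the Specification or from the cited references.

```python
def initial_rate(times_min, a410):
    """Least-squares slope of A410 versus time (absorbance units per minute)."""
    n = len(times_min)
    if n != len(a410) or n < 2:
        raise ValueError("need at least two paired readings")
    mean_t = sum(times_min) / n
    mean_a = sum(a410) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_min, a410))
    return sxy / sxx


def active_enzyme_mg_per_ml(rate_per_min, rate_per_mg_per_ml):
    """Active enzyme concentration, assuming rate is proportional to it.

    rate_per_mg_per_ml is a hypothetical calibration factor (A410/min per
    mg/ml of fully active enzyme) that would be determined experimentally.
    """
    return rate_per_min / rate_per_mg_per_ml


def total_protein_mg_per_ml(a280, a280_per_mg_per_ml, path_cm=1.0):
    """Total protein from A280, using an assumed absorbance coefficient."""
    return a280 / (a280_per_mg_per_ml * path_cm)


def purity(active_mg_per_ml, total_mg_per_ml):
    """Active enzyme / total protein ratio."""
    if total_mg_per_ml <= 0:
        raise ValueError("total protein must be positive")
    return active_mg_per_ml / total_mg_per_ml


if __name__ == "__main__":
    # Hypothetical readings for one fraction from a gradient elution.
    rate = initial_rate([0, 1, 2, 3], [0.05, 0.15, 0.25, 0.35])
    active = active_enzyme_mg_per_ml(rate, rate_per_mg_per_ml=2.0)
    total = total_protein_mg_per_ml(a280=0.1, a280_per_mg_per_ml=1.0)
    print(rate, active, total, purity(active, total))
```

Fitting a slope over several time points, rather than differencing two readings, reduces the influence of noise in any single absorbance measurement.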

As used herein, the terms "ASP protease," "Asp protease," and "Asp," refer to the 
serine proteases described herein. In some preferred embodiments, the Asp protease is 
the protease designated herein as 69B4 protease obtained from Cellulomonas strain 69B4. 
Thus, in preferred embodiments, the term "69B4 protease" refers to a naturally occurring 
mature protease derived from Cellulomonas strain 69B4 (DSM 16035) having substantially 
identical amino acid sequences as provided in SEQ ID NO:8. In alternative embodiments, 
the present invention provides portions of the ASP protease. 

The term "Cellulomonas protease homologues" refers to naturally occurring 
proteases having substantially identical amino acid sequences to the mature protease 
derived from Cellulomonas strain 69B4 or polynucleotide sequences which encode for such 
naturally occurring proteases, and which proteases retain the functional characteristics of a 
serine protease encoded by such nucleic acids. In some embodiments, these protease 
homologues are referred to as "cellulomonadins." 

As used herein, the terms "protease variant," "ASP variant," "ASP protease variant," 
and "69B protease variant" are used in reference to proteases that are similar to the wild- 
type ASP, particularly in their function, but have mutations in their amino acid sequence that 
make them different in sequence from the wild-type protease. 

As used herein, "Cellulomonas ssp." refers to all of the species within the genus 
"Cellulomonas" which are Gram-positive bacteria classified as members of the Family 
Cellulomonadaceae, Suborder Micrococcineae, Order Actinomycetales, Class 
Actinobacteria. It is recognized that the genus Cellulomonas continues to undergo 
taxonomical reorganization. Thus, it is intended that the genus include species that have 
been reclassified. 

As used herein, "Streptomyces ssp." refers to all of the species within the genus 
"Streptomyces" which are Gram-positive bacteria classified as members of the Family 
Streptomycetaceae, Suborder Streptomycineae, Order Actinomycetales, class 
Actinobacteria. It is recognized that the genus Streptomyces continues to undergo 
taxonomical reorganization. Thus, it is intended that the genus include species that have 
been reclassified. 

As used herein, "the genus Bacillus" includes all species within the genus "Bacillus" 
as known to those of skill in the art, including but not limited to B. subtilis, B. licheniformis, B. 
lentus, B. brevis, B. stearothermophilus, B. alkalophilus, B. amyloliquefaciens, B. clausii, B. 
halodurans, B. megaterium, B. coagulans, B. circulans, B. lautus, and B. thuringiensis. It is 
recognized that the genus Bacillus continues to undergo taxonomical reorganization. Thus, 
it is intended that the genus include species that have been reclassified, including but not 
limited to such organisms as B. stearothermophilus, which is now named "Geobacillus 
stearothermophilus." The production of resistant endospores in the presence of oxygen is 
considered the defining feature of the genus Bacillus, although this characteristic also 
applies to the recently named Alicyclobacillus, Amphibacillus, Aneurinibacillus, 
Anoxybacillus, Brevibacillus, Filobacillus, Gracilibacillus, Halobacillus, Paenibacillus, 
Salibacillus, Thermobacillus, Ureibacillus, and Virgibacillus. 

The terms "polynucleotide" and "nucleic acid", used interchangeably herein, refer to 
a polymeric form of nucleotides of any length, either ribonucleotides or 
deoxyribonucleotides. These terms include, but are not limited to, a single-, double- or 
triple-stranded DNA, genomic DNA, cDNA, RNA, DNA-RNA hybrid, or a polymer comprising 
purine and pyrimidine bases, or other natural, chemically, biochemically modified, non- 
natural or derivatized nucleotide bases. The following are non-limiting examples of 
polynucleotides: genes, gene fragments, chromosomal fragments, ESTs, exons, introns, 
mRNA, tRNA, rRNA, ribozymes, cDNA, recombinant polynucleotides, branched 
polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any 
sequence, nucleic acid probes, and primers. In some embodiments, polynucleotides 
comprise modified nucleotides, such as methylated nucleotides and nucleotide analogs, 
uracil, other sugars and linking groups such as fluororibose and thioate, and nucleotide 
branches. In alternative embodiments, the sequence of nucleotides is interrupted by non- 
nucleotide components. 

As used herein, the terms "DNA construct" and "transforming DNA" are used 
interchangeably to refer to DNA used to introduce sequences into a host cell or organism. 
The DNA may be generated in vitro by PCR or any other suitable technique(s) known to 
those in the art. In particularly preferred embodiments, the DNA construct comprises a 
sequence of interest (e.g., as an incoming sequence). In some embodiments, the sequence 
is operably linked to additional elements such as control elements (e.g., promoters, etc.). 
The DNA construct may further comprise a selectable marker. It may further comprise an 
incoming sequence flanked by homology boxes. In a further embodiment, the transforming 
DNA comprises other non-homologous sequences, added to the ends (e.g., stuffer 
sequences or flanks). In some embodiments, the ends of the incoming sequence are 
closed such that the transforming DNA forms a closed circle. The transforming sequences 
may be wild-type, mutant or modified. In some embodiments, the DNA construct comprises 
sequences homologous to the host cell chromosome. In other embodiments, the DNA 
construct comprises non-homologous sequences. Once the DNA construct is assembled in 
vitro it may be used to: 1) insert heterologous sequences into a desired target sequence of 
a host cell; 2) mutagenize a region of the host cell chromosome (i.e., replace an 
endogenous sequence with a heterologous sequence); 3) delete target genes; and/or 
4) introduce a replicating plasmid into the host. 

As used herein, the terms "expression cassette" and "expression vector" refer to 
nucleic acid constructs generated recombinantly or synthetically, with a series of specified 
nucleic acid elements that permit transcription of a particular nucleic acid in a target cell. 
The recombinant expression cassette can be incorporated into a plasmid, chromosome, 
mitochondrial DNA, plastid DNA, virus, or nucleic acid fragment. Typically, the recombinant 
expression cassette portion of an expression vector includes, among other sequences, a 
nucleic acid sequence to be transcribed and a promoter. In preferred embodiments, 
expression vectors have the ability to incorporate and express heterologous DNA fragments 
in a host cell. Many prokaryotic and eukaryotic expression vectors are commercially 
available. Selection of appropriate expression vectors is within the knowledge of those of 
skill in the art. The term "expression cassette" is used interchangeably herein with "DNA 
construct," and their grammatical equivalents. 

As used herein, the term "vector" refers to a polynucleotide construct designed to 
introduce nucleic acids into one or more cell types. Vectors include cloning vectors, 
expression vectors, shuttle vectors, plasmids, cassettes and the like. In some 
embodiments, the polynucleotide construct comprises a DNA sequence encoding the 
protease (e.g., precursor or mature protease) that is operably linked to a suitable 
prosequence (e.g., secretory, etc.) capable of effecting the expression of the DNA in a 
suitable host. 

As used herein, the term "plasmid" refers to a circular double-stranded (ds) DNA 
construct used as a cloning vector, and which forms an extrachromosomal self-replicating 
genetic element in some eukaryotes or prokaryotes, or integrates into the host 
chromosome. 

As used herein in the context of introducing a nucleic acid sequence into a cell, the 
term "introduced" refers to any method suitable for transferring the nucleic acid sequence 
into the cell. Such methods for introduction include but are not limited to protoplast fusion, 
transfection, transformation, conjugation, and transduction (See e.g., Ferrari et al., 
"Genetics," in Hardwood et al. (eds.), Bacillus, Plenum Publishing Corp., pages 57-72, 
[1989]). 

As used herein, the terms "transformed" and "stably transformed" refer to a cell that 
has a non-native (heterologous) polynucleotide sequence integrated into its genome or as 
an episomal plasmid that is maintained for at least two generations. 

As used herein, the term "selectable marker-encoding nucleotide sequence" refers to 
a nucleotide sequence which is capable of expression in the host cells and where 
expression of the selectable marker confers to cells containing the expressed gene the 
ability to grow in the presence of a corresponding selective agent or lack of an essential 
nutrient. 

As used herein, the terms "selectable marker" and "selective marker" refer to a 
nucleic acid (e.g., a gene) capable of expression in a host cell which allows for ease of 
selection of those hosts containing the vector. Examples of such selectable markers include 
but are not limited to antimicrobials. Thus, the term "selectable marker" refers to genes that 
provide an indication that a host cell has taken up an incoming DNA of interest or some 
other reaction has occurred. Typically, selectable markers are genes that confer 
antimicrobial resistance or a metabolic advantage on the host cell to allow cells containing 
the exogenous DNA to be distinguished from cells that have not received any exogenous 
sequence during the transformation. A "residing selectable marker" is one that is located on 
the chromosome of the microorganism to be transformed. A residing selectable marker 
encodes a gene that is different from the selectable marker on the transforming DNA 
construct. Selective markers are well known to those of skill in the art. As indicated above, 
preferably the marker is an antimicrobial resistant marker (e.g., ampR; phleoR; specR; kanR; 
eryR; tetR; cmpR; and neoR; See e.g., Guerot-Fleury, Gene, 167:335-337 [1995]; Palmeros 
et al., Gene 247:255-264 [2000]; and Trieu-Cuot et al., Gene, 23:331-341 [1983]). Other 
markers useful in accordance with the invention include, but are not limited to auxotrophic 
markers, such as tryptophan; and detection markers, such as β-galactosidase. 

As used herein, the term "promoter" refers to a nucleic acid sequence that functions 
to direct transcription of a downstream gene. In preferred embodiments, the promoter is 
appropriate to the host cell in which the target gene is being expressed. The promoter, 
together with other transcriptional and translational regulatory nucleic acid sequences (also 
termed "control sequences") is necessary to express a given gene. In general, the 
transcriptional and translational regulatory sequences include, but are not limited to, 
promoter sequences, ribosomal binding sites, transcriptional start and stop sequences, 
translational start and stop sequences, and enhancer or activator sequences. 

A nucleic acid is "operably linked" when it is placed into a functional relationship with 
another nucleic acid sequence. For example, DNA encoding a secretory leader (i.e., a 
signal peptide), is operably linked to DNA for a polypeptide if it is expressed as a preprotein 
that participates in the secretion of the polypeptide; a promoter or enhancer is operably
linked to a coding sequence if it affects the transcription of the sequence; or a ribosome
binding site is operably linked to a coding sequence if it is positioned so as to facilitate
translation. Generally, "operably linked" means that the DNA sequences being linked are
contiguous, and, in the case of a secretory leader, contiguous and in reading phase.
However, enhancers do not have to be contiguous. Linking is accomplished by ligation at
convenient restriction sites. If such sites do not exist, the synthetic oligonucleotide adaptors
or linkers are used in accordance with conventional practice.

As used herein the term "gene" refers to a polynucleotide (e.g., a DNA segment), 
that encodes a polypeptide and includes regions preceding and following the coding regions 
as well as intervening sequences (introns) between individual coding segments (exons). 

As used herein, "homologous genes" refers to a pair of genes from different, but 
usually related species, which correspond to each other and which are identical or very 
similar to each other. The term encompasses genes that are separated by speciation (i.e., 
the development of new species) (e.g., orthologous genes), as well as genes that have been 
separated by genetic duplication (e.g., paralogous genes). 

As used herein, "ortholog" and "orthologous genes" refer to genes in different 
species that have evolved from a common ancestral gene (i.e., a homologous gene) by 
speciation. Typically, orthologs retain the same function during the course of evolution. 
Identification of orthologs finds use in the reliable prediction of gene function in newly 
sequenced genomes. 

As used herein, "paralog" and "paralogous genes" refer to genes that are related by
duplication within a genome. While orthologs retain the same function through the course of 
evolution, paralogs evolve new functions, even though some functions are often related to 
the original one. Examples of paralogous genes include, but are not limited to genes 
encoding trypsin, chymotrypsin, elastase, and thrombin, which are all serine proteinases and 
occur together within the same species. 

As used herein, "homology" refers to sequence similarity or identity, with identity 
being preferred. This homology is determined using standard techniques known in the art 
(See e.g., Smith and Waterman, Adv. Appl. Math., 2:482 [1981]; Needleman and Wunsch, 
J. Mol. Biol., 48:443 [1970]; Pearson and Lipman, Proc. Natl. Acad. Sci. USA 85:2444 
[1988]; programs such as GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics 
Software Package (Genetics Computer Group, Madison, WI); and Devereux et al., Nucl.
Acid Res., 12:387-395 [1984]).
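
By way of illustration only, the global alignment approach of Needleman and Wunsch cited
above may be sketched as follows. The function name and the match, mismatch, and gap
scores are assumptions chosen for this example; they are not parameters of the GAP,
BESTFIT, FASTA, or TFASTA programs.

```python
def needleman_wunsch_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -1) -> int:
    """Return the optimal global alignment score of sequences a and b.

    Scoring values are illustrative assumptions, not published defaults.
    """
    # Row 0: aligning the empty prefix of a with b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] with the empty prefix of b costs i gaps.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # a[i-1] aligned against a gap in b
            left = curr[j - 1] + gap  # b[j-1] aligned against a gap in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]
```

Only the score is computed here; recovering the alignment itself would additionally
require a traceback through the scoring matrix.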

As used herein, an "analogous sequence" is one wherein the function of the gene is 
essentially the same as the gene based on the Cellulomonas strain 69B4 protease. 
Additionally, analogous genes include at least 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 
85%, 90%, 95%, 97%, 98%, 99% or 100% sequence identity with the sequence of the 
Cellulomonas strain 69B4 protease. Alternately, analogous sequences have an alignment 
of between 70 to 100% of the genes found in the Cellulomonas strain 69B4 protease region 
and/or have at least between 5-10 genes found in the region aligned with the genes in the 
Cellulomonas strain 69B4 chromosome. In additional embodiments more than one of the
above properties applies to the sequence. Analogous sequences are determined by known
methods of sequence alignment. A commonly used alignment method is BLAST, although 
as indicated above and below, there are other methods that also find use in aligning 
sequences. 

One example of a useful algorithm is PILEUP. PILEUP creates a multiple sequence
alignment from a group of related sequences using progressive, pair-wise alignments. It
can also plot a tree showing the clustering relationships used to create the alignment.
PILEUP uses a simplification of the progressive alignment method of Feng and Doolittle
(Feng and Doolittle, J. Mol. Evol., 25:351-360 [1987]). The method is similar to that
described by Higgins and Sharp (Higgins and Sharp, CABIOS 5:151-153 [1989]). Useful
PILEUP parameters include a default gap weight of 3.00, a default gap length weight of
0.10, and weighted end gaps.

Another example of a useful algorithm is the BLAST algorithm, described by Altschul
et al. (Altschul et al., J. Mol. Biol., 215:403-410 [1990]; and Karlin et al., Proc. Natl. Acad.
Sci. USA 90:5873-5877 [1993]). A particularly useful BLAST program is the WU-BLAST-2
program (See, Altschul et al., Meth. Enzymol., 266:460-480 [1996]). WU-BLAST-2 uses
several search parameters, most of which are set to the default values. The adjustable
parameters are set with the following values: overlap span = 1, overlap fraction = 0.125,
word threshold (T) = 11. The HSP S and HSP S2 parameters are dynamic values and are
established by the program itself depending upon the composition of the particular
sequence and composition of the particular database against which the sequence of interest
is being searched. However, the values may be adjusted to increase sensitivity. A % amino
acid sequence identity value is determined by the number of matching identical residues
divided by the total number of residues of the "longer" sequence in the aligned region. The
"longer" sequence is the one having the most actual residues in the aligned region (gaps
introduced by WU-BLAST-2 to maximize the alignment score are ignored).
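
The identity calculation just described may be sketched as follows, assuming the two
sequences have already been aligned and are supplied as equal-length strings in which
gaps are written as "-". The function name and the gap symbol are assumptions made for
this example and are not part of WU-BLAST-2.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over an aligned region, using the 'longer' sequence
    (the one with the most actual residues) as the denominator."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Count columns where both sequences carry the same residue (not a gap).
    matches = sum(1 for x, y in zip(aligned_a, aligned_b)
                  if x == y and x != gap)
    # Gaps introduced by the aligner are ignored when counting residues.
    residues_a = sum(1 for x in aligned_a if x != gap)
    residues_b = sum(1 for y in aligned_b if y != gap)
    longer = max(residues_a, residues_b)
    if longer == 0:
        raise ValueError("aligned region contains no residues")
    return 100.0 * matches / longer
```

For example, `percent_identity("AAAA", "AAAT")` gives 75.0, since three of the four
residues match.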

Thus, "percent (%) nucleic acid sequence identity" is defined as the percentage of
nucleotide residues in a candidate sequence that are identical with the nucleotide residues
of the starting sequence (i.e., the sequence of interest). A preferred method utilizes the
BLASTN module of WU-BLAST-2 set to the default parameters, with overlap span and
overlap fraction set to 1 and 0.125, respectively.

As used herein, the term "hybridization" refers to the process by which a strand of
nucleic acid joins with a complementary strand through base pairing, as known in the art.
A nucleic acid sequence is considered to be "selectively hybridizable" to a reference
nucleic acid sequence if the two sequences specifically hybridize to one another under
moderate to high stringency hybridization and wash conditions. Hybridization conditions are
based on the melting temperature (Tm) of the nucleic acid binding complex or probe. For
example, "maximum stringency" typically occurs at about Tm-5°C (5° below the Tm of the
probe); "high stringency" at about 5-10°C below the Tm; "intermediate stringency" at about
10-20°C below the Tm of the probe; and "low stringency" at about 20-25°C below the Tm.
Functionally, maximum stringency conditions may be used to identify sequences having 
strict identity or near-strict identity with the hybridization probe; while an intermediate or low 
stringency hybridization can be used to identify or detect polynucleotide sequence 
homologs. 
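
As a simple worked illustration of the stringency ranges given above, the approximate
hybridization temperature window for a probe of known Tm may be computed as follows.
The table and function names are assumptions made for this example, and the offsets are
the approximate values stated above rather than exact limits.

```python
from typing import Dict, Tuple

# Approximate offsets below Tm, in degrees C, as (nearest, farthest).
STRINGENCY_OFFSETS_C: Dict[str, Tuple[float, float]] = {
    "maximum": (5.0, 5.0),
    "high": (5.0, 10.0),
    "intermediate": (10.0, 20.0),
    "low": (20.0, 25.0),
}

def hybridization_window(tm_c: float, stringency: str) -> Tuple[float, float]:
    """Return the (lowest, highest) approximate temperature in degrees C."""
    near, far = STRINGENCY_OFFSETS_C[stringency]
    return (tm_c - far, tm_c - near)
```

For a probe with a Tm of 70°C, `hybridization_window(70.0, "high")` returns
`(60.0, 65.0)`.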

Moderate and high stringency hybridization conditions are well known in the art. An 
example of high stringency conditions includes hybridization at about 42°C in 50%
formamide, 5X SSC, 5X Denhardt's solution, 0.5% SDS and 100 μg/ml denatured carrier
DNA followed by washing two times in 2X SSC and 0.5% SDS at room temperature and two
additional times in 0.1X SSC and 0.5% SDS at 42°C. An example of moderately stringent
conditions includes an overnight incubation at 37°C in a solution comprising 20% formamide,
5X SSC (150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5X
Denhardt's solution, 10% dextran sulfate and 20 mg/ml denatured sheared salmon sperm
DNA, followed by washing the filters in 1X SSC at about 37-50°C. Those of skill in the art
know how to adjust the temperature, ionic strength, etc. as necessary to accommodate 
factors such as probe length and the like. 

As used herein, "recombinant" includes reference to a cell or vector that has been
modified by the introduction of a heterologous nucleic acid sequence or that the cell is 
derived from a cell so modified. Thus, for example, recombinant cells express genes that 
are not found in identical form within the native (non-recombinant) form of the cell or 
express native genes that are otherwise abnormally expressed, under expressed or not 
expressed at all as a result of deliberate human intervention. "Recombination," 
"recombining," and generating a "recombined" nucleic acid are generally the assembly of 
two or more nucleic acid fragments wherein the assembly gives rise to a chimeric gene. 

In a preferred embodiment, mutant DNA sequences are generated with site 
saturation mutagenesis in at least one codon. In another preferred embodiment, site 
saturation mutagenesis is performed for two or more codons. In a further embodiment, 
mutant DNA sequences have more than 50%, more than 55%, more than 60%, more than 
65%, more than 70%, more than 75%, more than 80%, more than 85%, more than 90%, 
more than 95%, or more than 98% homology with the wild-type sequence. In alternative 
embodiments, mutant DNA is generated in vivo using any known mutagenic procedure such
as, for example, radiation, nitrosoguanidine and the like. The desired DNA sequence is then
isolated and used in the methods provided herein. 
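
To illustrate what saturation of a single codon entails, the following sketch enumerates
every variant of a coding sequence in which one codon is replaced by each of the other
possible codons. It is a conceptual model only: the function name is an assumption made
for this example, stop codons are not filtered out, and no particular laboratory protocol
or primer design is implied.

```python
from itertools import product
from typing import List

def saturate_codon(dna: str, codon_index: int) -> List[str]:
    """Return every variant of dna with one codon replaced by another codon.

    codon_index is zero-based and counts codons from the start of dna.
    """
    start = 3 * codon_index
    original = dna[start:start + 3]
    if codon_index < 0 or len(original) != 3:
        raise ValueError("codon_index is outside the coding sequence")
    variants = []
    for bases in product("ACGT", repeat=3):
        codon = "".join(bases)
        if codon != original:  # skip the wild-type codon itself
            variants.append(dna[:start] + codon + dna[start + 3:])
    return variants
```

For a sequence written in upper-case A, C, G, and T, each saturated position yields 63
variant sequences, since 64 codons exist and the wild-type codon is excluded.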

As used herein, the term "target sequence" refers to a DNA sequence in the host cell 
that encodes the sequence where it is desired for the incoming sequence to be inserted into 
the host cell genome. In some embodiments, the target sequence encodes a functional 
wild-type gene or operon, while in other embodiments the target sequence encodes a 
functional mutant gene or operon, or a non-functional gene or operon. 

As used herein, a "flanking sequence" refers to any sequence that is either upstream 
or downstream of the sequence being discussed (e.g., for genes A-B-C, gene B is flanked 
by the A and C gene sequences). In a preferred embodiment, the incoming sequence is 
flanked by a homology box on each side. In another embodiment, the incoming sequence 
and the homology boxes comprise a unit that is flanked by stuffer sequence on each side. 
In some embodiments, a flanking sequence is present on only a single side (either 3' or 5'),
but in preferred embodiments, it is on each side of the sequence being flanked.

As used herein, the term "stuffer sequence" refers to any extra DNA that flanks 
homology boxes (typically vector sequences). However, the term encompasses any non- 
homologous DNA sequence. Not to be limited by any theory, a stuffer sequence provides a 
noncritical target for a cell to initiate DNA uptake. 

As used herein, the terms "amplification" and "gene amplification" refer to a process
by which specific DNA sequences are disproportionately replicated such that the amplified 
gene becomes present in a higher copy number than was initially present in the genome. In 
some embodiments, selection of cells by growth in the presence of a drug (e.g., an inhibitor 
of an inhibitable enzyme) results in the amplification of either the endogenous gene 
encoding the gene product required for growth in the presence of the drug or by 
amplification of exogenous (i.e., input) sequences encoding this gene product, or both. 

"Amplification" is a special case of nucleic acid replication involving template 
specificity. It is to be contrasted with non-specific template replication (i.e., replication that is 
template-dependent but not dependent on a specific template). Template specificity is here 
distinguished from fidelity of replication (i.e., synthesis of the proper polynucleotide 
sequence) and nucleotide (ribo- or deoxyribo-) specificity. Template specificity is frequently 
described in terms of "target" specificity. Target sequences are "targets" in the sense that 
they are sought to be sorted out from other nucleic acid. Amplification techniques have 
been designed primarily for this sorting out.

As used herein, the term "co-amplification" refers to the introduction into a single cell
of an amplifiable marker in conjunction with other gene sequences (i.e., comprising one or 
more non-selectable genes such as those contained within an expression vector) and the 
application of appropriate selective pressure such that the cell amplifies both the amplifiable 
marker and the other, non-selectable gene sequences. The amplifiable marker may be 
physically linked to the other gene sequences or alternatively two separate pieces of DNA, 
one containing the amplifiable marker and the other containing the non-selectable marker, 
may be introduced into the same cell. 

As used herein, the terms "amplifiable marker," "amplifiable gene," and "amplification 
vector" refer to a gene or a vector encoding a gene which permits the amplification of that 
gene under appropriate growth conditions. 

"Template specificity" is achieved in most amplification techniques by the choice of
enzyme. Amplification enzymes are enzymes that, under the conditions in which they are
used, will process only specific sequences of nucleic acid in a heterogeneous mixture of
nucleic acid. For example, in the case of Qβ replicase, MDV-1 RNA is the specific template
for the replicase (See e.g., Kacian et al., Proc. Natl. Acad. Sci. USA 69:3038 [1972]). Other
nucleic acids are not replicated by this amplification enzyme. Similarly, in the case of T7 RNA
polymerase, this amplification enzyme has a stringent specificity for its own promoters (See,
Chamberlin et al., Nature 228:227 [1970]). In the case of T4 DNA ligase, the enzyme will
not ligate the two oligonucleotides or polynucleotides, where there is a mismatch between
the oligonucleotide or polynucleotide substrate and the template at the ligation junction
(See, Wu and Wallace, Genomics 4:560 [1989]). Finally, Taq and Pfu polymerases, by
virtue of their ability to function at high temperature, are found to display high specificity for 
the sequences bounded and thus defined by the primers; the high temperature results in 
thermodynamic conditions that favor primer hybridization with the target sequences and not 
hybridization with non-target sequences. 

As used herein, the term "amplifiable nucleic acid" refers to nucleic acids which may
be amplified by any amplification method. It is contemplated that "amplifiable nucleic acid" 
will usually comprise "sample template." 

As used herein, the term "sample template" refers to nucleic acid originating from a 
sample which is analyzed for the presence of "target" (defined below). In contrast, 
"background template" is used in reference to nucleic acid other than sample template 
which may or may not be present in a sample. Background template is most often 
inadvertent. It may be the result of carryover, or it may be due to the presence of nucleic 
acid contaminants sought to be purified away from the sample. For example, nucleic acids
from organisms other than those to be detected may be present as background in a test
sample. 

As used herein, the term "primer" refers to an oligonucleotide, whether occurring 
naturally as in a purified restriction digest or produced synthetically, which is capable of 
acting as a point of initiation of synthesis when placed under conditions in which synthesis of 
a primer extension product which is complementary to a nucleic acid strand is induced (i.e.,
in the presence of nucleotides and an inducing agent such as DNA polymerase and at a 
suitable temperature and pH). The primer is preferably single stranded for maximum 
efficiency in amplification, but may alternatively be double stranded. If double stranded, the 
primer is first treated to separate its strands before being used to prepare extension 
products. Preferably, the primer is an oligodeoxyribonucleotide. The primer must be 
sufficiently long to prime the synthesis of extension products in the presence of the inducing 
agent. The exact lengths of the primers will depend on many factors, including temperature, 
source of primer and the use of the method. 

As used herein, the term "probe" refers to an oligonucleotide (i.e., a sequence of
nucleotides), whether occurring naturally as in a purified restriction digest or produced 
synthetically, recombinantly or by PCR amplification, which is capable of hybridizing to 
another oligonucleotide of interest. A probe may be single-stranded or double-stranded. 
Probes are useful in the detection, identification and isolation of particular gene sequences. 
It is contemplated that any probe used in the present invention will be labeled with any
"reporter molecule," so that it is detectable in any detection system, including, but not limited
to enzyme (e.g., ELISA, as well as enzyme-based histochemical assays), fluorescent,
radioactive, and luminescent systems. It is not intended that the present invention be limited 
to any particular detection system or label. 

As used herein, the term "target," when used in reference to the polymerase chain 
reaction, refers to the region of nucleic acid bounded by the primers used for polymerase 
chain reaction. Thus, the "target" is sought to be sorted out from other nucleic acid 
sequences. A "segment" is defined as a region of nucleic acid within the target sequence. 

As used herein, the term "polymerase chain reaction" ("PCR") refers to the methods 
of U.S. Patent Nos. 4,683,195, 4,683,202, and 4,965,188, hereby incorporated by reference,
which include methods for increasing the concentration of a segment of a target sequence 
in a mixture of genomic DNA without cloning or purification. This process for amplifying the 
target sequence consists of introducing a large excess of two oligonucleotide primers to the 
DNA mixture containing the desired target sequence, followed by a precise sequence of 
thermal cycling in the presence of a DNA polymerase. The two primers are complementary
to their respective strands of the double stranded target sequence. To effect amplification,
the mixture is denatured and the primers then annealed to their complementary sequences 
within the target molecule. Following annealing, the primers are extended with a 
polymerase so as to form a new pair of complementary strands. The steps of denaturation, 
primer annealing and polymerase extension can be repeated many times (i.e., denaturation, 
annealing and extension constitute one "cycle"; there can be numerous "cycles") to obtain a 
high concentration of an amplified segment of the desired target sequence. The length of 
the amplified segment of the desired target sequence is determined by the relative positions 
of the primers with respect to each other, and therefore, this length is a controllable 
parameter. By virtue of the repeating aspect of the process, the method is referred to as the 
"polymerase chain reaction" (hereinafter "PCR"). Because the desired amplified segments 
of the target sequence become the predominant sequences (in terms of concentration) in 
the mixture, they are said to be "PCR amplified". 
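
The dependence of the amplified segment on the relative positions of the two primers can
be illustrated with a simple string model. In this sketch the template is the top strand
written 5' to 3', the forward primer matches that strand, and the reverse primer is
complementary to it. All function names are assumptions made for this example, primers
are required to match exactly, and the doubling of copy number in each cycle is an
idealization that real reactions only approximate.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def amplified_segment(template: str, forward: str, reverse: str) -> str:
    """Return the top-strand segment bounded by the two primers."""
    start = template.find(forward)
    # The reverse primer anneals where its reverse complement occurs
    # on the top strand, downstream of the forward primer.
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward)) if start != -1 else -1
    if start == -1 or end == -1:
        raise ValueError("primers do not bound a segment on this template")
    return template[start:end + len(site)]

def ideal_copies(initial_copies: int, cycles: int) -> int:
    """Copies after the given number of cycles, assuming perfect doubling."""
    return initial_copies * 2 ** cycles
```

Moving either primer along the template changes the length of the returned segment, which
is the sense in which that length is a controllable parameter.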

As used herein, the term "amplification reagents" refers to those reagents 
(deoxyribonucleotide triphosphates, buffer, etc.), needed for amplification except for 
primers, nucleic acid template and the amplification enzyme. Typically, amplification 
reagents along with other reaction components are placed and contained in a reaction 
vessel (test tube, microwell, etc.). 

With PCR, it is possible to amplify a single copy of a specific target sequence in 
genomic DNA to a level detectable by several different methodologies (e.g., hybridization
with a labeled probe; incorporation of biotinylated primers followed by avidin-enzyme
conjugate detection; incorporation of 32P-labeled deoxynucleotide triphosphates, such as
dCTP or dATP, into the amplified segment). In addition to genomic DNA, any
oligonucleotide or polynucleotide sequence can be amplified with the appropriate set of
primer molecules. In particular, the amplified segments created by the PCR process itself
are, themselves, efficient templates for subsequent PCR amplifications.

As used herein, the terms "PCR product," "PCR fragment," and "amplification 
product" refer to the resultant mixture of compounds after two or more cycles of the PCR 
steps of denaturation, annealing and extension are complete. These terms encompass the 
case where there has been amplification of one or more segments of one or more target 
sequences. 

As used herein, the term "RT-PCR" refers to the replication and amplification of RNA 
sequences. In this method, reverse transcription is coupled to PCR, most often using a one 
enzyme procedure in which a thermostable polymerase is employed, as described in U.S. 
Patent No. 5,322,770, herein incorporated by reference. In RT-PCR, the RNA template is
converted to cDNA due to the reverse transcriptase activity of the polymerase, and then
amplified using the polymerizing activity of the polymerase (i.e., as in other PCR methods).

As used herein, the terms "restriction endonucleases" and "restriction enzymes"
refer to bacterial enzymes, each of which cuts double-stranded DNA at or near a specific
nucleotide sequence. 
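
As an illustration of cutting at a specific nucleotide sequence, the following sketch
digests a single strand at each occurrence of a recognition site. The default site and cut
position are those of EcoRI (GAATTC, cut after the first G), which is used here only as a
familiar example and is not named in this disclosure; the function name is likewise an
assumption. Only one strand is modeled, so the staggered ends produced on
double-stranded DNA are not represented.

```python
from typing import List

def digest(seq: str, site: str = "GAATTC", cut_offset: int = 1) -> List[str]:
    """Split seq at every occurrence of site, cutting cut_offset bases in."""
    fragments = []
    last_cut = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[last_cut:cut])
        last_cut = cut
        pos = seq.find(site, pos + 1)
    # The remainder after the final cut (or the whole sequence if uncut).
    fragments.append(seq[last_cut:])
    return fragments
```

Joining the returned fragments in order reproduces the input sequence, and a sequence
lacking the site is returned as a single fragment.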

A "restriction site" refers to a nucleotide sequence recognized and cleaved by a 
given restriction endonuclease and is frequently the site for insertion of DNA fragments. In 
certain embodiments of the invention restriction sites are engineered into the selective 
marker and into 5' and 3' ends of the DNA construct.

As used herein, the term "chromosomal integration" refers to the process whereby 
an incoming sequence is introduced into the chromosome of a host cell. The homologous 
regions of the transforming DNA align with homologous regions of the chromosome. 
Subsequently, the sequence between the homology boxes is replaced by the incoming 
sequence in a double crossover (i.e., homologous recombination). In some embodiments 
of the present invention, homologous sections of an inactivating chromosomal segment of a 
DNA construct align with the flanking homologous regions of the indigenous chromosomal 
region of the Bacillus chromosome. Subsequently, the indigenous chromosomal region is 
deleted by the DNA construct in a double crossover (i.e., homologous recombination). 
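
The outcome of such a double crossover can be pictured with a simple string model in
which the chromosomal sequence lying between two homology boxes is replaced by the
incoming sequence. The function and argument names are assumptions made for this
example; exact matches to each homology box are required, and the recombination
machinery itself is not modeled.

```python
def double_crossover(chromosome: str, left_box: str, right_box: str,
                     incoming: str) -> str:
    """Replace the region between two homology boxes with incoming."""
    left = chromosome.find(left_box)
    if left == -1:
        raise ValueError("left homology box not found")
    left_end = left + len(left_box)
    # The right box must lie downstream of the left box.
    right = chromosome.find(right_box, left_end)
    if right == -1:
        raise ValueError("right homology box not found downstream")
    # Both homology boxes are retained; only the region between them changes.
    return chromosome[:left_end] + incoming + chromosome[right:]
```

Passing an empty incoming sequence models a deletion of the region between the
homology boxes.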

"Homologous recombination" means the exchange of DNA fragments between two 
DNA molecules or paired chromosomes at the site of identical or nearly identical nucleotide 
sequences. In a preferred embodiment, chromosomal integration is homologous 
recombination. 

"Homologous sequences" as used herein means a nucleic acid or polypeptide 
sequence having 100%, 99%, 98%, 97%, 96%, 95%, 94%, 93%, 92%, 91%, 90%, 88%, 
85%, 80%, 75%, or 70% sequence identity to another nucleic acid or polypeptide sequence 
when optimally aligned for comparison. In some embodiments, homologous sequences 
have between 85% and 100% sequence identity, while in other embodiments there is 
between 90% and 100% sequence identity, and in more preferred embodiments, there is
between 95% and 100% sequence identity.

As used herein "amino acid" refers to peptide or protein sequences or portions 
thereof. The terms "protein," "peptide," and "polypeptide" are used interchangeably. 

As used herein, "protein of interest" and "polypeptide of interest" refer to a
protein/polypeptide that is desired and/or being assessed. In some embodiments, the 
protein of interest is expressed intracellularly, while in other embodiments, it is a secreted 
polypeptide. In particularly preferred embodiments, these enzymes include the serine
proteases of the present invention. In some embodiments, the protein of interest is a
secreted polypeptide which is fused to a signal peptide (i.e., an amino-terminal extension on
a protein to be secreted). Nearly all secreted proteins use an amino-terminal protein
extension which plays a crucial role in the targeting to and translocation of precursor 
proteins across the membrane. This extension is proteolytically removed by a signal 
peptidase during or immediately following membrane transfer. 

As used herein, the term "heterologous protein" refers to a protein or polypeptide 
that does not naturally occur in the host cell. Examples of heterologous proteins include 
enzymes such as hydrolases including proteases. In some embodiments, the genes
encoding the proteins are naturally occurring genes, while in other embodiments, mutated
and/or synthetic genes are used. 

As used herein, "homologous protein" refers to a protein or polypeptide native or 
naturally occurring in a cell. In preferred embodiments, the cell is a Gram-positive cell, while 
in particularly preferred embodiments, the cell is a Bacillus host cell. In alternative 
embodiments, the homologous protein is a native protein produced by other organisms, 
including but not limited to E. coli, Streptomyces, Trichoderma, and Aspergillus. The 
invention encompasses host cells producing the homologous protein via recombinant DNA 
technology. 

As used herein, an "operon region" comprises a group of contiguous genes that are 
transcribed as a single transcription unit from a common promoter, and are thereby subject 
to co-regulation. In some embodiments, the operon includes a regulator gene. In most 
preferred embodiments, operons that are highly expressed as measured by RNA levels, but 
have an unknown or unnecessary function are used. 

As used herein, an "antimicrobial region" is a region containing at least one gene that 
encodes an antimicrobial protein. 

A polynucleotide is said to "encode" an RNA or a polypeptide if, in its native state or 
when manipulated by methods known to those of skill in the art, it can be transcribed and/or 
translated to produce the RNA, the polypeptide or a fragment thereof. The anti-sense 
strand of such a nucleic acid is also said to encode the sequences. 

As is known in the art, a DNA can be transcribed by an RNA polymerase to produce 
RNA, but an RNA can be reverse transcribed by reverse transcriptase to produce a DNA. 
Thus a DNA can encode a RNA and vice versa. 
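
The two directions of information flow described above can be illustrated at the sequence
level as follows. The sketch assumes the DNA is supplied as the coding (sense) strand
written 5' to 3' in upper case, so that the transcript has the same sequence with U in place
of T; the function names are assumptions made for this example.

```python
def transcribe(dna_coding: str) -> str:
    """Return the RNA corresponding to a coding-strand DNA sequence."""
    return dna_coding.replace("T", "U")

def reverse_transcribe(rna: str) -> str:
    """Return the coding-strand DNA corresponding to an RNA sequence."""
    return rna.replace("U", "T")
```

Applying `reverse_transcribe` to the output of `transcribe` returns the original DNA
sequence.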

The term "regulatory segment" or "regulatory sequence" or "expression control 
sequence" refers to a polynucleotide sequence of DNA that is operatively linked with a 
polynucleotide sequence of DNA that encodes the amino acid sequence of a polypeptide
chain to effect the expression of the encoded amino acid sequence. The regulatory
sequence can inhibit, repress, or promote the expression of the operably linked
polynucleotide sequence encoding the amino acid.

"Host strain" or "host cell" refers to a suitable host for an expression vector 
comprising DNA according to the present invention.

An enzyme is "overexpressed" in a host cell if the enzyme is expressed in the cell at 
a higher level than the level at which it is expressed in a corresponding wild-type cell.

The terms "protein" and "polypeptide" are used interchangeably herein. The 3-letter
code for amino acids as defined in conformity with the IUPAC-IUB Joint Commission on
Biochemical Nomenclature (JCBN) is used throughout this disclosure. It is also understood
that a polypeptide may be coded for by more than one nucleotide sequence due to the
degeneracy of the genetic code.
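
The extent of this degeneracy can be illustrated by counting how many distinct nucleotide
sequences encode a given peptide. The sketch below uses the number of codons assigned
to each amino acid in the standard genetic code, keyed by 3-letter code; the table and
function names are assumptions made for this example, and stop codons are not counted.

```python
from math import prod
from typing import Dict, List

# Number of codons per amino acid in the standard genetic code.
CODON_COUNTS: Dict[str, int] = {
    "Ala": 4, "Arg": 6, "Asn": 2, "Asp": 2, "Cys": 2,
    "Gln": 2, "Glu": 2, "Gly": 4, "His": 2, "Ile": 3,
    "Leu": 6, "Lys": 2, "Met": 1, "Phe": 2, "Pro": 4,
    "Ser": 6, "Thr": 4, "Trp": 1, "Tyr": 2, "Val": 4,
}

def coding_sequence_count(peptide: List[str]) -> int:
    """Number of distinct nucleotide sequences encoding the peptide."""
    return prod(CODON_COUNTS[residue] for residue in peptide)
```

For example, the tripeptide Met-Leu-Ser can be encoded by 1 x 6 x 6 = 36 different
nucleotide sequences.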

A "prosequence" is an amino acid sequence between the signal sequence and 
mature protease that is necessary for the secretion of the protease. Cleavage of the
prosequence will result in a mature active protease.

The term "signal sequence" or "signal peptide" refers to any sequence of nucleotides 
and/or amino acids which may participate in the secretion of the mature or precursor forms 
of the protein. This definition of signal sequence is a functional one, meant to include all 
those amino acid sequences encoded by the N-terminal portion of the protein gene, which 
participate in the effectuation of the secretion of protein. They are often, but not universally,
bound to the N-terminal portion of a protein or to the N-terminal portion of a precursor
protein. The signal sequence may be endogenous or exogenous. The signal sequence
may be that normally associated with the protein (e.g., protease), or may be from a gene
encoding another secreted protein. One exemplary exogenous signal sequence comprises
the first seven amino acid residues of the signal sequence from Bacillus subtilis subtilisin
fused to the remainder of the signal sequence of the subtilisin from Bacillus lentus (ATCC 
21536). 

The term "hybrid signal sequence" refers to signal sequences in which part of the
sequence is obtained from the expression host fused to the signal sequence of the gene to
be expressed. In some embodiments, synthetic sequences are utilized.

The term "substantially the same signal activity" refers to the signal activity, as 
indicated by substantially the same secretion of the protease into the fermentation medium, 
for example a fermentation medium protease level being at least 50%, at least 60%, at least
70%, at least 80%, at least 90%, at least 95%, or at least 98% of the secreted protease levels
in the fermentation medium as provided by the signal sequence of SEQ ID NOS:5 and/or 9.

The term "mature" form of a protein or peptide refers to the final functional form of
the protein or peptide. To exemplify, a mature form of the protease of the present invention
at least includes the amino acid sequence identical to residue positions 1-189 of SEQ ID 
NO:8. 

The term "precursor" form of a protein or peptide refers to a mature form of the
protein having a prosequence operably linked to the amino or carboxyl terminus of the
protein. The precursor may also have a "signal" sequence operably linked to the amino
terminus of the prosequence. The precursor may also have additional polynucleotides that
are involved in post-translational activity (e.g., polynucleotides cleaved therefrom to leave
the mature form of a protein or peptide).

"Naturally occurring enzyme" refers to an enzyme having the unmodified amino acid 
sequence identical to that found in nature. Naturally occurring enzymes include native 
enzymes, those enzymes naturally expressed or found in the particular microorganism. 

The terms "derived from" and "obtained from" refer to not only a protease produced
or producible by a strain of the organism in question, but also a protease encoded by a DNA
sequence isolated from such strain and produced in a host organism containing such DNA
sequence. Additionally, the term refers to a protease which is encoded by a DNA sequence
of synthetic and/or cDNA origin and which has the identifying characteristics of the protease
in question. To exemplify, "proteases derived from Cellulomonas" refers to those enzymes
having proteolytic activity which are naturally-produced by Cellulomonas, as well as to serine
proteases like those produced by Cellulomonas sources but which through the use of 
genetic engineering techniques are produced by non-Cellulomonas organisms transformed 
with a nucleic acid encoding said serine proteases. 

A "derivative" within the scope of this definition generally retains the characteristic
proteolytic activity observed in the wild-type, native or parent form to the extent that the
derivative is useful for similar purposes as the wild-type, native or parent form. Functional
derivatives of serine protease encompass naturally occurring, synthetically or recombinantly
produced peptides or peptide fragments which have the general characteristics of the serine
protease of the present invention.

The term "functional derivative" refers to a derivative of a nucleic acid which has the
functional characteristics of a nucleic acid which encodes serine protease. Functional
derivatives of a nucleic acid which encode serine protease of the present invention
encompass naturally occurring, synthetically or recombinantly produced nucleic acids or
fragments and encode serine protease characteristic of the present invention. Wild-type
nucleic acids encoding serine proteases according to the invention include naturally occurring
alleles and homologues based on the degeneracy of the genetic code known in the art.

The term "identical" in the context of two nucleic acids or polypeptide sequences
refers to the residues in the two sequences that are the same when aligned for maximum
correspondence, as measured using one of the following sequence comparison or analysis
algorithms.

The term "optimal alignment" refers to the alignment giving the highest percent 
identity score. 

"Percent sequence identity," "percent amino acid sequence identity," "percent gene 
sequence identity," and/or "percent nucleic acid/polynucleotide sequence identity," with
respect to two amino acid, polynucleotide and/or gene sequences (as appropriate), refer to
the percentage of residues that are identical in the two sequences when the sequences are
optimally aligned. Thus, 80% amino acid sequence identity means that 80% of the amino
acids in two optimally aligned polypeptide sequences are identical.
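
The percent identity calculation defined above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been optimally aligned by an external program and are supplied as equal-length strings with "-" marking gaps. The function name, the gap symbol, and the choice of denominator (all alignment columns that are not gaps in both sequences) are illustrative assumptions and are not taken from the specification; alignment programs differ in how they count gapped columns.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned, equal-length sequences.

    Assumption: the denominator is the number of alignment columns,
    excluding columns in which both sequences carry a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep every column except those that are gaps in both sequences.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment contains no residues")
    # A column counts as identical only if both residues match and are not gaps.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Nine of ten aligned residues are the same.
    print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
```

Under these assumptions, a returned value of 80.0 would correspond to the "80% amino acid sequence identity" example given above.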

The phrase "substantially identical" in the context of two nucleic acids or
polypeptides thus refers to a polynucleotide or polypeptide comprising at least 70%
sequence identity, preferably at least 75%, preferably at least 80%, preferably at least 85%,
preferably at least 90%, preferably at least 95%, preferably at least 97%, preferably at
least 98% and preferably at least 99% sequence identity as compared to a reference
sequence using the programs or algorithms (e.g., BLAST, ALIGN, CLUSTAL) using
standard parameters. One indication that two polypeptides are substantially identical is that
the first polypeptide is immunologically cross-reactive with the second polypeptide.
Typically, polypeptides that differ by conservative amino acid substitutions are
immunologically cross-reactive. Thus, a polypeptide is substantially identical to a second
polypeptide, for example, where the two peptides differ only by a conservative substitution.
Another indication that two nucleic acid sequences are substantially identical is that the two
molecules hybridize to each other under stringent conditions (e.g., within a range of medium
to high stringency).

The phrase "equivalent," in this context, refers to serine protease enzymes that are
encoded by a polynucleotide capable of hybridizing to the polynucleotide having the
sequence as shown in SEQ ID NO:1, under conditions of medium to maximal stringency.
For example, being equivalent means that an equivalent mature serine protease comprises
at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least
92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%
and/or at least 99% sequence identity to the mature Cellulomonas serine protease having
the amino acid sequence of SEQ ID NO:8.

The term "isolated" or "purified" refers to a material that is removed from its original
environment (e.g., the natural environment if it is naturally occurring). For example, the
material is said to be "purified" when it is present in a particular composition in a higher or
lower concentration than exists in a naturally occurring or wild type organism or in
combination with components not normally present upon expression from a naturally
occurring or wild type organism. For example, a naturally occurring polynucleotide or
polypeptide present in a living animal is not isolated, but the same polynucleotide or
polypeptide, separated from some or all of the coexisting materials in the natural system, is
isolated. Such polynucleotides could be part of a vector, and/or such polynucleotides or
polypeptides could be part of a composition, and still be isolated in that such vector or
composition is not part of its natural environment. In preferred embodiments, a nucleic acid
or protein is said to be purified, for example, if it gives rise to essentially one band in an
electrophoretic gel or blot.

The term "isolated", when used in reference to a DNA sequence, refers to a DNA
sequence that has been removed from its natural genetic milieu and is thus free of other
extraneous or unwanted coding sequences, and is in a form suitable for use within
genetically engineered protein production systems. Such isolated molecules are those that
are separated from their natural environment and include cDNA and genomic clones.
Isolated DNA molecules of the present invention are free of other genes with which they are
ordinarily associated, but may include naturally occurring 5' and 3' untranslated regions such
as promoters and terminators. The identification of associated regions will be evident to one
of ordinary skill in the art (See e.g., Dynan and Tjian, Nature 316:774-78 [1985]). The term
"an isolated DNA sequence" is alternatively referred to as "a cloned DNA sequence". 

The term "isolated," when used in reference to a protein, refers to a protein that is
found in a condition other than its native environment. In a preferred form, the isolated
protein is substantially free of other proteins, particularly other homologous proteins. An
isolated protein is more than 10% pure, preferably more than 20% pure, and even more
preferably more than 30% pure, as determined by SDS-PAGE. Further aspects of the
invention encompass the protein in a highly purified form (i.e., more than 40% pure, more
than 60% pure, more than 80% pure, more than 90% pure, more than 95% pure, more than
97% pure, and even more than 99% pure), as determined by SDS-PAGE.

As used herein, the term "combinatorial mutagenesis" refers to methods in which
libraries of variants of a starting sequence are generated. In these libraries, the variants
contain one or several mutations chosen from a predefined set of mutations. In addition, the
methods provide means to introduce random mutations which were not members of the
predefined set of mutations. In some embodiments, the methods include those set forth in
U.S. Patent Appln. Ser. No. 09/699,250, filed October 26, 2000, hereby incorporated by
reference. In alternative embodiments, combinatorial mutagenesis methods encompass
commercially available kits (e.g., QuikChange® Multisite, Stratagene, San Diego, CA).

As used herein, the term "library of mutants" refers to a population of cells which are
identical in most of their genome but include different homologues of one or more genes.
Such libraries can be used, for example, to identify genes or operons with improved traits.

As used herein, the term "starting gene" refers to a gene of interest that encodes a
protein of interest that is to be improved and/or changed using the present invention.

As used herein, the term "multiple sequence alignment" ("MSA") refers to the
sequences of multiple homologs of a starting gene that are aligned using an algorithm (e.g.,
Clustal W).

As used herein, the terms "consensus sequence" and "canonical sequence" refer to
an archetypical amino acid sequence against which all variants of a particular protein or
sequence of interest are compared. The terms also refer to a sequence that sets forth the
nucleotides that are most often present in a DNA sequence of interest. For each position of
a gene, the consensus sequence gives the amino acid that is most abundant in that position
in the MSA.
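
As an illustration of the definition above, the following Python sketch derives a consensus sequence by taking the most abundant residue in each column of an MSA. It assumes the MSA is supplied as a list of equal-length, already aligned strings with "-" for gaps. The function name and the handling of gaps and ties (gaps are ignored; ties go to the residue seen first in the column) are assumptions made for this example only.

```python
from collections import Counter


def consensus_sequence(msa: list[str], gap: str = "-") -> str:
    """Return the most abundant residue at each position of an MSA."""
    if not msa:
        raise ValueError("MSA is empty")
    length = len(msa[0])
    if any(len(row) != length for row in msa):
        raise ValueError("all MSA rows must have the same length")

    consensus = []
    for column in zip(*msa):
        # Count residues in this column, ignoring gap characters.
        counts = Counter(res for res in column if res != gap)
        # A column made only of gaps is reported as a gap.
        consensus.append(counts.most_common(1)[0][0] if counts else gap)
    return "".join(consensus)


if __name__ == "__main__":
    print(consensus_sequence(["ACDE", "ACDF", "GCDF"]))
```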

As used herein, the term "consensus mutation" refers to a difference in the sequence
of a starting gene and a consensus sequence. Consensus mutations are identified by
comparing the sequences of the starting gene and the consensus sequence resulting from
an MSA. In some embodiments, consensus mutations are introduced into the starting gene
such that it becomes more similar to the consensus sequence. Consensus mutations also
include amino acid changes that change an amino acid in a starting gene to an amino acid
that is more frequently found in an MSA at that position relative to the frequency of that
amino acid in the starting gene. Thus, the term consensus mutation comprises all single
amino acid changes that replace an amino acid of the starting gene with an amino acid that
is more abundant in the MSA at that position than the amino acid of the starting gene.
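
The broader sense of "consensus mutation" described above (any single change to a residue that is more abundant in the MSA than the starting residue) can be sketched as follows. This is a minimal illustration, assuming that the starting sequence has the same length as the MSA rows and is aligned to them column by column. The function name and the (position, starting residue, replacement) tuple format are hypothetical.

```python
from collections import Counter


def consensus_mutations(starting: str, msa: list[str], gap: str = "-"):
    """List single amino acid changes toward residues that are more
    abundant in the MSA column than the residue of the starting sequence.

    Positions are reported 1-based.
    """
    if any(len(row) != len(starting) for row in msa):
        raise ValueError("starting sequence and MSA rows must be aligned")

    mutations = []
    for pos, column in enumerate(zip(*msa), start=1):
        start_res = starting[pos - 1]
        counts = Counter(res for res in column if res != gap)
        # Counter returns 0 for a starting residue absent from the column.
        for res, n in sorted(counts.items()):
            if res != start_res and n > counts[start_res]:
                mutations.append((pos, start_res, res))
    return mutations


if __name__ == "__main__":
    msa = ["ACDE", "GCDF", "GCDF", "GCDF"]
    for pos, old, new in consensus_mutations("ACDE", msa):
        print(f"{old}{pos}{new}")
```

A residue that is merely as frequent as the starting residue is not reported, since the definition requires it to be more abundant.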

As used herein, the term "initial hit" refers to a variant that was identified by
screening a combinatorial consensus mutagenesis library. In preferred embodiments, initial
hits have improved performance characteristics, as compared to the starting gene.

As used herein, the term "improved hit" refers to a variant that was identified by
screening an enhanced combinatorial consensus mutagenesis library.

As used herein, the terms "improving mutation" and "performance-enhancing
mutation" refer to a mutation that leads to improved performance when it is introduced into
the starting gene. In some preferred embodiments, these mutations are identified by
sequencing hits that were identified during the screening step of the method. In most
embodiments, mutations that are more frequently found in hits are likely to be improving
mutations, as compared to an unscreened combinatorial consensus mutagenesis library.

As used herein, the term "enhanced combinatorial consensus mutagenesis library"
refers to a combinatorial consensus mutagenesis (CCM) library that is designed and
constructed based on screening and/or sequencing results from an earlier round of CCM
mutagenesis and screening. In some
embodiments, the enhanced CCM library is based on the sequence of an initial hit resulting
from an earlier round of CCM. In additional embodiments, the enhanced CCM is designed
such that mutations that were frequently observed in initial hits from earlier rounds of
mutagenesis and screening are favored. In some preferred embodiments, this is
accomplished by omitting primers that encode performance-reducing mutations or by
increasing the concentration of primers that encode performance-enhancing mutations
relative to other primers that were used in earlier CCM libraries.

As used herein, the term "performance-reducing mutations" refers to mutations in the
combinatorial consensus mutagenesis library that are less frequently found in hits resulting 
from screening as compared to an unscreened combinatorial consensus mutagenesis 
library. In preferred embodiments, the screening process removes and/or reduces the 
abundance of variants that contain "performance-reducing mutations." 
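
The frequency comparison that underlies the terms "performance-enhancing mutation" and "performance-reducing mutation" can be sketched as below. The sketch assumes that the fraction of sequenced clones carrying each mutation is already known for both the unscreened library and the hits. The function name, the mutation labels, the frequencies, and the "neutral" label are hypothetical and serve only to illustrate the comparison; a mutation flagged here is a candidate, not a confirmed improving or reducing mutation.

```python
def classify_mutations(hit_freq: dict[str, float],
                       library_freq: dict[str, float]) -> dict[str, str]:
    """Label each library mutation by comparing its frequency among hits
    with its frequency in the unscreened library."""
    labels = {}
    for mutation, lib in library_freq.items():
        # A mutation never seen among the hits has frequency 0 there.
        hit = hit_freq.get(mutation, 0.0)
        if hit > lib:
            labels[mutation] = "performance-enhancing (candidate)"
        elif hit < lib:
            labels[mutation] = "performance-reducing (candidate)"
        else:
            labels[mutation] = "neutral"
    return labels


if __name__ == "__main__":
    # Hypothetical frequencies: fraction of sequenced clones with each mutation.
    library = {"A12G": 0.25, "S45T": 0.25, "L78V": 0.25}
    hits = {"A12G": 0.40, "S45T": 0.05, "L78V": 0.25}
    for mutation, label in classify_mutations(hits, library).items():
        print(mutation, label)
```

In practice a statistical test on the clone counts would be a reasonable addition before acting on such labels; none is specified in the text.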

As used herein, the term "functional assay" refers to an assay that provides an 
indication of a protein's activity. In particularly preferred embodiments, the term refers to 
assay systems in which a protein is analyzed for its ability to function in its usual capacity. 
For example, in the case of enzymes, a functional assay involves determining the 
effectiveness of the enzyme in catalyzing a reaction. 

As used herein, the term "target property" refers to the property of the starting gene 
that is to be altered. It is not intended that the present invention be limited to any particular 
target property. However, in some preferred embodiments, the target property is the 
stability of a gene product (e.g., resistance to denaturation, proteolysis or other degradative 
factors), while in other embodiments, the level of production in a production host is altered. 
Indeed, it is contemplated that any property of a starting gene will find use in the present 
invention. 

The term "property" or grammatical equivalents thereof in the context of a nucleic
acid, as used herein, refers to any characteristic or attribute of a nucleic acid that can be
selected or detected. These properties include, but are not limited to, a property affecting
binding to a polypeptide, a property conferred on a cell comprising a particular nucleic acid,
a property affecting gene transcription (e.g., promoter strength, promoter recognition,
promoter regulation, enhancer function), a property affecting RNA processing (e.g., RNA
splicing, RNA stability, RNA conformation, and post-transcriptional modification), a property
affecting translation (e.g., level, regulation, binding of mRNA to ribosomal proteins,
post-translational modification). For example, a binding site for a transcription factor,
polymerase, regulatory factor, etc., of a nucleic acid may be altered to produce desired
characteristics or to identify undesirable characteristics.

The term "property" or grammatical equivalents thereof in the context of a
polypeptide, as used herein, refers to any characteristic or attribute of a polypeptide that can
be selected or detected. These properties include, but are not limited to, oxidative stability,
substrate specificity, catalytic activity, thermal stability, alkaline stability, pH activity profile,
resistance to proteolytic degradation, Km, kcat, kcat/Km ratio, protein folding, inducing an
immune response, ability to bind to a ligand, ability to bind to a receptor, ability to be
secreted, ability to be displayed on the surface of a cell, ability to oligomerize, ability to
signal, ability to stimulate cell proliferation, ability to inhibit cell proliferation, ability to induce
apoptosis, ability to be modified by phosphorylation or glycosylation, ability to treat disease.

As used herein, the term "screening" has its usual meaning in the art and is, in
general, a multi-step process. In the first step, a mutant nucleic acid or variant polypeptide
therefrom is provided. In the second step, a property of the mutant nucleic acid or variant
polypeptide is determined. In the third step, the determined property is compared to a
property of the corresponding precursor nucleic acid, to the property of the corresponding
naturally occurring polypeptide or to the property of the starting material (e.g., the initial
sequence) for the generation of the mutant nucleic acid.

It will be apparent to the skilled artisan that the screening procedure for obtaining a
nucleic acid or protein with an altered property depends upon the property of the starting
material the modification of which the generation of the mutant nucleic acid is intended to
facilitate. The skilled artisan will therefore appreciate that the invention is not limited to any
specific property to be screened for and that the following description of properties lists
illustrative examples only. Methods for screening for any particular property are generally
described in the art. For example, one can measure binding, pH, specificity, etc., before
and after mutation, wherein a change indicates an alteration. Preferably, the screens are
performed in a high-throughput manner, including multiple samples being screened
simultaneously, including, but not limited to assays utilizing chips, phage display, and
multiple substrates and/or indicators.

As used herein, in some embodiments, screens encompass selection steps in which
variants of interest are enriched from a population of variants. Examples of these
embodiments include the selection of variants that confer a growth advantage to the host
organism, as well as phage display or any other method of display, where variants can be
captured from a population of variants based on their binding or catalytic properties. In a
preferred embodiment, a library of variants is exposed to stress (heat, protease,
denaturation) and subsequently variants that are still intact are identified in a screen or
enriched by selection. It is intended that the term encompass any suitable means for
selection. Indeed, it is not intended that the present invention be limited to any particular
method of screening.

As used herein, the term "targeted randomization" refers to a process that produces
a plurality of sequences where one or several positions have been randomized. In some
embodiments, randomization is complete (i.e., all four nucleotides, A, T, G, and C can occur
at a randomized position). In alternative embodiments, randomization of a nucleotide is
limited to a subset of the four nucleotides. Targeted randomization can be applied to one or
several codons of a sequence, coding for one or several proteins of interest. When
expressed, the resulting libraries produce protein populations in which one or more amino
acid positions can contain a mixture of all 20 amino acids or a subset of amino acids, as
determined by the randomization scheme of the randomized codon. In some embodiments,
the individual members of a population resulting from targeted randomization differ in the
number of amino acids, due to targeted or random insertion or deletion of codons. In further
embodiments, synthetic amino acids are included in the protein populations produced. In
some preferred embodiments, the majority of members of a population resulting from
targeted randomization show greater sequence homology to the consensus sequence than
the starting gene. In some embodiments, the sequence encodes one or more proteins of
interest. In alternative embodiments, the proteins have differing biological functions. In
some preferred embodiments, the incoming sequence comprises at least one selectable
marker.
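
To illustrate how the randomization scheme of a codon determines which amino acids can appear at a position, the following Python sketch expands a degenerate codon written in standard IUPAC nucleotide codes and translates each resulting codon with the standard genetic code. The function name and the example schemes (NNN, NNK, RST) are illustrative assumptions; the specification does not name any particular scheme.

```python
from itertools import product

# IUPAC nucleotide codes mapped to the bases they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def encoded_residues(scheme: str) -> set[str]:
    """Amino acids ('*' = stop) encoded by a degenerate codon scheme."""
    if len(scheme) != 3:
        raise ValueError("a codon scheme must have exactly three positions")
    # Expand each degenerate position into its possible bases.
    codons = ("".join(bases)
              for bases in product(*(IUPAC[s] for s in scheme.upper())))
    return {CODON_TABLE[codon] for codon in codons}


if __name__ == "__main__":
    for scheme in ("NNN", "NNK", "RST"):
        print(scheme, "".join(sorted(encoded_residues(scheme))))
```

Complete randomization (NNN) yields all 20 amino acids plus stop codons, whereas a restricted scheme such as RST yields only a subset, which corresponds to the "subset of amino acids" case described above.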

The terms "modified sequence" and "modified genes" are used interchangeably
herein to refer to a sequence that includes a deletion, insertion or interruption of a naturally
occurring nucleic acid sequence. In some preferred embodiments, the expression product
of the modified sequence is a truncated protein (e.g., if the modification is a deletion or
interruption of the sequence). In some particularly preferred embodiments, the truncated
protein retains biological activity. In alternative embodiments, the expression product of the
modified sequence is an elongated protein (e.g., modifications comprising an insertion into
the nucleic acid sequence). In some embodiments, an insertion leads to a truncated protein
(e.g., when the insertion results in the formation of a stop codon). Thus, an insertion may
result in either a truncated protein or an elongated protein as an expression product.

As used herein, the terms "mutant sequence" and "mutant gene" are used
interchangeably and refer to a sequence that has an alteration in at least one codon
occurring in a host cell's wild-type sequence. The expression product of the mutant
sequence is a protein with an altered amino acid sequence relative to the wild-type. The
expression product may have an altered functional capacity (e.g., enhanced enzymatic
activity).

The terms "mutagenic primer" or "mutagenic oligonucleotide" (used interchangeably
herein) are intended to refer to oligonucleotide compositions which correspond to a portion
of the template sequence and which are capable of hybridizing thereto. With respect to
mutagenic primers, the primer will not precisely match the template nucleic acid, the
mismatch or mismatches in the primer being used to introduce the desired mutation into the
nucleic acid library. As used herein, "non-mutagenic primer" or "non-mutagenic
oligonucleotide" refers to oligonucleotide compositions which will match precisely to the
template nucleic acid. In one embodiment of the invention, only mutagenic primers are
used. In another preferred embodiment of the invention, the primers are designed so that
for at least one region at which a mutagenic primer has been included, there is also a
non-mutagenic primer included in the oligonucleotide mixture. By adding a mixture of mutagenic
primers and non-mutagenic primers corresponding to at least one of the mutagenic primers,
it is possible to produce a resulting nucleic acid library in which a variety of combinatorial
mutational patterns are presented. For example, if it is desired that some of the members of
the mutant nucleic acid library retain their precursor sequence at certain positions while
other members are mutant at such sites, the non-mutagenic primers provide the ability to
obtain a specific level of non-mutant members within the nucleic acid library for a given
residue. The methods of the invention employ mutagenic and non-mutagenic
oligonucleotides which are generally between 10-50 bases in length, more preferably about
15-45 bases in length. However, it may be necessary to use primers that are either shorter
than 10 bases or longer than 50 bases to obtain the mutagenesis result desired. With
respect to corresponding mutagenic and non-mutagenic primers, it is not necessary that the
corresponding oligonucleotides be of identical length, but only that there is overlap in the
region corresponding to the mutation to be added.

Primers may be added in a pre-defined ratio according to the present invention. For
example, if it is desired that the resulting library have a significant level of a certain specific
mutation and a lesser amount of a different mutation at the same or different site, by
adjusting the amount of primer added, it is possible to produce the desired biased library.
Alternatively, by adding lesser or greater amounts of non-mutagenic primers, it is possible to
adjust the frequency with which the corresponding mutation(s) are produced in the mutant
nucleic acid library.

As used herein, the phrase "contiguous mutations" refers to mutations which are 
presented within the same oligonucleotide primer. For example, contiguous mutations may
be adjacent to or near each other; however, they will be introduced into the resulting mutant
template nucleic acids by the same primer.

As used herein, the phrase "discontiguous mutations" refers to mutations which are 
presented in separate oligonucleotide primers. For example, discontiguous mutations will 
be introduced into the resulting mutant template nucleic acids by separately prepared 
oligonucleotide primers. 

The terms "wild-type sequence" and "wild-type gene" are used interchangeably
herein to refer to a sequence that is native or naturally occurring in a host cell. In some
embodiments, the wild-type sequence refers to a sequence of interest that is the starting 
point of a protein engineering project. The wild-type sequence may encode either a 
homologous or heterologous protein. A homologous protein is one the host cell would 
produce without intervention. A heterologous protein is one that the host cell would not 
produce but for the intervention. 

As used herein, the term "antibodies" refers to immunoglobulins. Antibodies include 
but are not limited to immunoglobulins obtained directly from any species from which it is 
desirable to produce antibodies. In addition, the present invention encompasses modified 
antibodies. The term also refers to antibody fragments that retain the ability to bind to the
epitope that the intact antibody binds and include polyclonal antibodies, monoclonal
antibodies, chimeric antibodies, and anti-idiotype (anti-ID) antibodies. Antibody fragments
include, but are not limited to, the complementarity-determining regions (CDRs), single-chain
fragment variable regions (scFv), heavy chain variable region (VH), and light chain variable
region (VL). Polyclonal and monoclonal antibodies are also encompassed by the present
invention. Preferably, the antibodies are monoclonal antibodies. 

The term "oxidation stable" refers to proteases of the present invention that retain a
specified amount of enzymatic activity over a given period of time under conditions
prevailing during the proteolytic, hydrolyzing, cleaning or other process of the invention, for
example while exposed to or contacted with bleaching agents or oxidizing agents. In some
embodiments, the proteases retain at least 50%, 60%, 70%, 75%, 80%, 85%, 90%, 92%,
95%, 96%, 97%, 98% or 99% proteolytic activity after contact with a bleaching or oxidizing
agent over a given time period, for example, at least 1 minute, 3 minutes, 5 minutes, 8
minutes, 12 minutes, 16 minutes, 20 minutes, etc. In some embodiments, the stability is
measured as described in the Examples.
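
The "retained activity" criterion used in this and the related stability definitions reduces to a simple ratio, sketched below in Python. The function names, the activity values, and the default threshold are hypothetical; the actual assay and units are those described in the Examples, which this sketch does not attempt to reproduce.

```python
def retained_activity_percent(initial_activity: float,
                              activity_after_exposure: float) -> float:
    """Proteolytic activity remaining after exposure, as a percentage
    of the activity measured before exposure."""
    if initial_activity <= 0:
        raise ValueError("initial activity must be positive")
    return 100.0 * activity_after_exposure / initial_activity


def is_stable(initial_activity: float, activity_after_exposure: float,
              threshold_percent: float = 50.0) -> bool:
    """True if the retained activity meets the chosen threshold."""
    retained = retained_activity_percent(initial_activity,
                                         activity_after_exposure)
    return retained >= threshold_percent


if __name__ == "__main__":
    # Hypothetical readings in arbitrary activity units, before and after
    # contact with an oxidizing agent for a fixed time.
    print(retained_activity_percent(200.0, 150.0))
    print(is_stable(200.0, 150.0, threshold_percent=70.0))
```

The same calculation applies whichever stress condition is tested, with only the exposure condition, time period, and threshold changing.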

The term "chelator stable" refers to proteases of the present invention that retain a
specified amount of enzymatic activity over a given period of time under conditions
prevailing during the proteolytic, hydrolyzing, cleaning or other process of the invention, for
example while exposed to or contacted with chelating agents. In some embodiments, the
proteases retain at least 50%, 60%, 70%, 75%, 80%, 85%, 90%, 92%, 95%, 96%, 97%,
98% or 99% proteolytic activity after contact with a chelating agent over a given time period,
for example, at least 10 minutes, 20 minutes, 40 minutes, 60 minutes, 100 minutes, etc. In
some embodiments, the chelator stability is measured as described in the Examples.

The terms "thermally stable" and "thermostable" refer to proteases of the present
invention that retain a specified amount of enzymatic activity after exposure to identified
temperatures over a given period of time under conditions prevailing during the proteolytic,
hydrolyzing, cleaning or other process of the invention, for example while exposed to altered
temperatures. Altered temperatures include increased or decreased temperatures. In
some embodiments, the proteases retain at least 50%, 60%, 70%, 75%, 80%, 85%, 90%,
92%, 95%, 96%, 97%, 98% or 99% proteolytic activity after exposure to altered
temperatures over a given time period, for example, at least 60 minutes, 120 minutes, 180
minutes, 240 minutes, 300 minutes, etc. In some embodiments, the thermostability is
determined as described in the Examples.

The term "enhanced stability" in the context of an oxidation, chelator, thermal and/or 
pH stable protease refers to a higher retained proteolytic activity over time as compared to 
other serine proteases (e.g., subtilisin proteases) and/or wild-type enzymes. 

The term "diminished stability" in the context of an oxidation, chelator, thermal and/or
pH stable protease refers to a lower retained proteolytic activity over time as compared to
other serine proteases (e.g., subtilisin proteases) and/or wild-type enzymes.

As used herein, the term "cleaning composition" includes, unless otherwise
indicated, granular or powder-form all-purpose or "heavy-duty" washing agents, especially
cleaning detergents; liquid, gel or paste-form all-purpose washing agents, especially the
so-called heavy-duty liquid types; liquid fine-fabric detergents; hand dishwashing agents or light
duty dishwashing agents, especially those of the high-foaming type; machine dishwashing
agents, including the various tablet, granular, liquid and rinse-aid types for household and
institutional use; liquid cleaning and disinfecting agents, including antibacterial hand-wash
types, cleaning bars, mouthwashes, denture cleaners, car or carpet shampoos, bathroom
cleaners; hair shampoos and hair-rinses; shower gels and foam baths and metal cleaners;
as well as cleaning auxiliaries such as bleach additives and "stain-stick" or pre-treat types.

It is to be understood that the test methods described in the Examples herein are 
used to determine the respective values of the parameters of the present invention, as such 
invention is described and claimed herein. 

Unless otherwise noted, all component or composition levels are in reference to the 
active level of that component or composition, and are exclusive of impurities, for example, 
residual solvents or by-products, which may be present in commercially available sources. 

Enzyme component weights are based on total active protein.

All percentages and ratios are calculated by weight unless otherwise indicated. All 
percentages and ratios are calculated based on the total composition unless otherwise 
indicated. 

It should be understood that every maximum numerical limitation given throughout 
this specification includes every lower numerical limitation, as if such lower numerical 
limitations were expressly written herein. Every minimum numerical limitation given 
throughout this specification will include every higher numerical limitation, as if such higher 
numerical limitations were expressly written herein. Every numerical range given throughout 
this specification will include every narrower numerical range that falls within such broader 
numerical range, as if such narrower numerical ranges were all expressly written herein. 

The term "cleaning activity" refers to the cleaning performance achieved by the 
protease under conditions prevailing during the proteolytic, hydrolyzing, cleaning or other 
process of the invention. In some embodiments, cleaning performance is determined by the 
application of various cleaning assays concerning enzyme sensitive stains, for example 
grass, blood, milk, or egg protein as determined by various chromatographic, 
spectrophotometric or other quantitative methodologies after subjection of the stains to 
standard wash conditions. Exemplary assays include, but are not limited to those described
in WO 99/34011, and U.S. Pat. 6,605,458 (both of which are herein incorporated by
reference), as well as those methods included in the Examples.

The term "cleaning effective amount" of a protease refers to the quantity of protease 
described hereinbefore that achieves a desired level of enzymatic activity in a specific 
cleaning composition. Such effective amounts are readily ascertained by one of ordinary 
skill in the art and are based on many factors, such as the particular protease used, the 
cleaning application, the specific composition of the cleaning composition, and whether a 
liquid or dry (e.g., granular, bar) composition is required, etc. 

The term "cleaning adjunct materials," as used herein, means any liquid, solid or
gaseous material selected for the particular type of cleaning composition desired and the
form of the product (e.g., liquid, granule, powder, bar, paste, spray, tablet, gel, or foam
composition), which materials are also preferably compatible with the protease enzyme used
in the composition. In some embodiments, granular compositions are in "compact" form,
while in other embodiments, the liquid compositions are in a "concentrated" form.

The term "enhanced performance" in the context of cleaning activity refers to an 
increased or greater cleaning activity of certain enzyme sensitive stains such as egg, milk, 
grass or blood, as determined by usual evaluation after a standard wash cycle and/or 
multiple wash cycles. 

The term "diminished performance" in the context of cleaning activity refers to a 
decreased or lesser cleaning activity of certain enzyme sensitive stains such as egg, milk, 
grass or blood, as determined by usual evaluation after a standard wash cycle. 

The term "comparative performance" in the context of cleaning activity refers to at 
least 60%, at least 70%, at least 80%, at least 90%, or at least 95% of the cleaning activity of a 
comparative subtilisin protease (e.g., commercially available proteases), including but not 
limited to OPTIMASE™ protease (Genencor), PURAFECT™ protease products 
(Genencor), SAVINASE™ protease (Novozymes), BPN'-variants (See e.g., U.S. Pat. No. 
Re 34,606), RELASE™, DURAZYME™, EVERLASE™, KANNASE™ protease 
(Novozymes), MAXACAL™, MAXAPEM™, PROPERASE™ proteases (Genencor; See 
also, U.S. Pat. No. Re 34,606, U.S. Pat. Nos. 5,700,676; 5,955,340; 6,312,936; 6,482,628), 
and B. lentus variant protease products (for example those described in WO 92/21760, WO 
95/23221 and/or WO 97/07770 (Henkel)). Exemplary subtilisin protease variants include, but 
are not limited to those having substitutions or deletions at residue positions equivalent to 
positions 76, 101, 103, 104, 120, 159, 167, 170, 194, 195, 217, 232, 235, 236, 245, 248, 
and/or 252 of BPN'. Cleaning performance can be determined by comparing the proteases 
of the present invention with those subtilisin proteases in various cleaning assays 
concerning enzyme sensitive stains such as grass, blood or milk as determined by usual 
spectrophotometric or analytical methodologies after standard wash cycle conditions. 

As used herein, a "low detergent concentration" system includes detergents where 
less than about 800 ppm of detergent components are present in the wash water. 
Japanese detergents are typically considered low detergent concentration systems, as they 
usually have approximately 667 ppm of detergent components present in the wash 
water. 

As used herein, a "medium detergent concentration" system includes detergents 
wherein between about 800 ppm and about 2000 ppm of detergent components are present 
in the wash water. North American detergents are generally considered to be medium 
detergent concentration systems as they usually have approximately 975 ppm of detergent 
components present in the wash water. Brazilian detergents typically have approximately 
1500 ppm of detergent components present in the wash water. 

As used herein, a "high detergent concentration" system includes detergents wherein 
greater than about 2000 ppm of detergent components are present in the wash water. 
European detergents are generally considered to be high detergent concentration systems 
as they have approximately 3000-8000 ppm of detergent components in the wash water. 
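
The low, medium and high detergent concentration ranges defined above can be illustrated with a short sketch. The function name and the treatment of the exact boundary values are assumptions made for illustration only; the specification gives the limits as "about" 800 ppm and "about" 2000 ppm, so the cut-offs here should not be read as sharp.

```python
def detergent_concentration_class(ppm: float) -> str:
    """Classify a wash liquor by ppm of detergent components.

    Hypothetical helper: thresholds follow the approximate ranges in the
    text (low: < about 800 ppm; medium: about 800 to about 2000 ppm;
    high: > about 2000 ppm). Boundary handling is an assumption.
    """
    if ppm < 0:
        raise ValueError("ppm must be non-negative")
    if ppm < 800:
        return "low"
    if ppm <= 2000:
        return "medium"
    return "high"


# Typical values quoted in the text (Europe is given as a 3000-8000 ppm range).
typical_ppm = {"Japan": 667, "North America": 975, "Brazil": 1500, "Europe": 3000}
for region, ppm in typical_ppm.items():
    print(region, ppm, detergent_concentration_class(ppm))
```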

As used herein, "fabric cleaning compositions" include hand and machine laundry 
detergent compositions including laundry additive compositions and compositions suitable 
for use in the soaking and/or pretreatment of stained fabrics (e.g., clothes, linens, and other 
textile materials). 

As used herein, "non-fabric cleaning compositions" include non-textile (i.e., non-fabric) 
surface cleaning compositions, including but not limited to dishwashing detergent 
compositions, oral cleaning compositions, denture cleaning compositions, and personal 
cleansing compositions. 

The "compact" form of the cleaning compositions herein is best reflected by density 
and, in terms of composition, by the amount of inorganic filler salt. Inorganic filler salts are 
conventional ingredients of detergent compositions in powder form. In conventional 
detergent compositions, the filler salts are present in substantial amounts, typically 17-35% 
by weight of the total composition. In contrast, in compact compositions, the filler salt is 
present in amounts not exceeding 15% of the total composition. In some embodiments, the 
filler salt is present in amounts that do not exceed 10%, or more preferably, 5%, by weight 
of the composition. In some embodiments, the inorganic filler salts are selected from the 
alkali and alkaline-earth-metal salts of sulfates and chlorides. A preferred filler salt is 
sodium sulfate. 

II. Serine Protease Enzymes and Nucleic Acid Encoding Serine Protease 
Enzymes 

The present invention provides isolated polynucleotides encoding amino acid 
sequences of proteases. In some embodiments, these polynucleotides encode amino acid 
sequences comprising at least 65% amino acid sequence identity, preferably at least 70% 
amino acid sequence identity, more preferably at least 75% amino acid sequence identity, 
still more preferably at least 80% amino acid sequence identity, more preferably at least 85% 
amino acid sequence identity, even more preferably at least 90% amino acid sequence 
identity, more preferably at least 92% amino acid sequence identity, yet more preferably at 
least 95% amino acid sequence identity, more preferably at least 97% amino acid sequence 
identity, still more preferably at least 98% amino acid sequence identity, and most preferably 
at least 99% amino acid sequence identity to an amino acid sequence as shown in SEQ ID 
NOS:6-8 (e.g., at least a portion of the amino acid sequence encoded by the polynucleotide 
having proteolytic activity, including the mature protease catalyzing the hydrolysis of peptide 
linkages of substrates), and/or demonstrating comparable or enhanced washing 
performance under identified wash conditions. 

In some embodiments, the percent identity (amino acid sequence, nucleic acid 
sequence, gene sequence) is determined by a direct comparison of the sequence 
information between two molecules by aligning the sequences, counting the exact number 
of matches between the two aligned sequences, dividing by the length of the shorter 
sequence, and multiplying the result by 100. Readily available computer programs find use 
in these analyses, such as those described above. Programs for determining nucleotide 
sequence identity are available in the Wisconsin Sequence Analysis Package, Version 8 
(Genetics Computer Group, Madison, WI), for example, the BESTFIT, FASTA and GAP 
programs, which also rely on the Smith and Waterman algorithm. These programs are 
readily utilized with the default parameters recommended by the manufacturer and 
described in the Wisconsin Sequence Analysis Package referred to above. 
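
The percent identity calculation described above (align, count exact matches, divide by the length of the shorter sequence, multiply by 100) can be sketched as follows. This is a minimal illustration that assumes the two sequences have already been aligned and padded with a gap character; the function name and the example sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences.

    Counts positions where both sequences carry the same (non-gap) residue,
    divides by the ungapped length of the shorter sequence, and multiplies
    by 100, following the procedure stated in the text.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    shorter = min(
        len(aligned_a.replace(gap, "")), len(aligned_b.replace(gap, ""))
    )
    if shorter == 0:
        raise ValueError("sequences must contain at least one residue")
    return 100.0 * matches / shorter


# Hypothetical toy alignment: 7 matching columns, shorter sequence has 8 residues.
print(percent_identity("ACGT-ACGT", "ACGTTAC-T"))
```

The alignment itself would in practice come from a program such as those named above; this sketch covers only the counting step.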

An example of an algorithm that is suitable for determining sequence similarity is the 
BLAST algorithm, which is described in Altschul, et al., J. Mol. Biol., 215:403-410 (1990). 
Software for performing BLAST analyses is publicly available through the National Center 
for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This algorithm involves first 
identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the 
query sequence that either match or satisfy some positive-valued threshold score T when 
aligned with a word of the same length in a database sequence. These initial neighborhood 
word hits act as starting points to find longer HSPs containing them. The word hits are 
expanded in both directions along each of the two sequences being compared for as far as 
the cumulative alignment score can be increased. Extension of the word hits is stopped 
when: the cumulative alignment score falls off by the quantity X from a maximum achieved 
value; the cumulative score goes to zero or below; or the end of either sequence is reached. 
The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the 
alignment. The BLAST program uses as defaults a wordlength (W) of 11, the BLOSUM62 
scoring matrix (See, Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1989)), 
alignments (B) of 50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands. 
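
The seed-and-extend idea described above can be sketched in a few lines. This is a simplified illustration and not the BLAST program itself: it assumes exact word matches only (the neighborhood threshold T is omitted), ungapped extension, and a match reward of 5 and mismatch penalty of -4 taken from the M and N defaults quoted in the text. All function names and the example sequences are hypothetical.

```python
from collections import defaultdict


def find_seeds(query, subject, w):
    """Return (query_pos, subject_pos) pairs where a word of length w matches exactly."""
    index = defaultdict(list)
    for i in range(len(query) - w + 1):
        index[query[i:i + w]].append(i)
    seeds = []
    for j in range(len(subject) - w + 1):
        for i in index.get(subject[j:j + w], []):
            seeds.append((i, j))
    return seeds


def extend_one_way(query, subject, qi, sj, step, start_score, x_drop, match, mismatch):
    """Extend in one direction; return (best cumulative score, residues kept)."""
    score = best = start_score
    length = best_len = 0
    while 0 <= qi < len(query) and 0 <= sj < len(subject):
        score += match if query[qi] == subject[sj] else mismatch
        length += 1
        if score > best:
            best, best_len = score, length
        # Stop when the score falls X below the maximum, or reaches zero or below.
        if best - score >= x_drop or score <= 0:
            break
        qi += step
        sj += step
    return best, best_len


def extend_seed(query, subject, qi, sj, w, x_drop=20, match=5, mismatch=-4):
    """Ungapped extension of a word hit; returns (score, q_start, q_end, s_start)."""
    score, right_len = extend_one_way(
        query, subject, qi + w, sj + w, +1, w * match, x_drop, match, mismatch
    )
    score, left_len = extend_one_way(
        query, subject, qi - 1, sj - 1, -1, score, x_drop, match, mismatch
    )
    return score, qi - left_len, qi + w + right_len, sj - left_len


# Hypothetical toy sequences sharing the segment ACGTACGT.
query = "GGACGTACGTCC"
subject = "TTACGTACGTAA"
print(find_seeds(query, subject, w=4))
print(extend_seed(query, subject, 2, 2, w=4, x_drop=10))
```

Raising `x_drop` lets an extension survive longer runs of mismatches at the cost of more work per seed, which is one way the parameters trade sensitivity against speed.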

The BLAST algorithm then performs a statistical analysis of the similarity between 
two sequences (See e.g., Karlin and Altschul, Proc. Nat'l. Acad. Sci. USA 90:5873-5787 
[1993]). One measure of similarity provided by the BLAST algorithm is the smallest sum 
probability (P(N)), which provides an indication of the probability by which a match between 
two nucleotide or amino acid sequences would occur by chance. For example, a nucleic 
acid is considered similar to a serine protease nucleic acid of this invention if the smallest 
sum probability in a comparison of the test nucleic acid to a serine protease nucleic acid is 
less than about 0.1, more preferably less than about 0.01, and most preferably less than 
about 0.001. Where the test nucleic acid encodes a serine protease polypeptide, it is 
considered similar to a specified serine protease nucleic acid if the comparison results in a 
smallest sum probability of less than about 0.5, and more preferably less than about 0.2. 

In some embodiments of the present invention, sequences were analyzed by BLAST 
and protein translation sequence tools. In some experiments, the preferred version was 
BLAST (Basic BLAST version 2.0). The program chosen was "BlastX", and the database 
chosen was "nr". Standard/default parameter values were employed. 

In some preferred embodiments, the present invention encompasses the 
approximately 1621 base pairs in length polynucleotide set forth in SEQ ID NO:1. A start 
codon is shown in bold in SEQ ID NO:1. In another embodiment of the present invention, 
the polynucleotides encoding these amino acid sequences comprise a 1485 base pair 
portion (residues 1-1485 of SEQ ID NO:2) that, if expressed, is believed to encode a signal 
sequence (nucleotides 1-84 of SEQ ID NO:5) encoding amino acids 1-28 of SEQ ID NO:9; 
an N-terminal prosequence (nucleotides 84-594 encoding amino acid residues 29-198 of 
SEQ ID NO:6); a mature protease sequence (nucleotides 595-1161 of SEQ ID NO:2 
encoding amino acid residues 1-189 of SEQ ID NO:8); and a C-terminal pro-sequence 
(nucleotides 1162-1486 encoding amino acid residues 388-495 of SEQ ID NO:6). 
Alternatively, the signal peptide, the N-terminal pro-sequence, mature serine protease 
sequence and C-terminal pro-sequence are numbered in relation to the amino acid residues 
of the mature protease of SEQ ID NO:6 being numbered 1-189, i.e., signal peptide (residues 
-198 to -171), an N-terminal pro sequence (residues -171 to -1), the mature serine 
protease sequence (residues 1-189) and a C-terminal pro-sequence (residues 190-298). In 
another embodiment of the present invention, the polynucleotide encoding an amino acid 
sequence having proteolytic activity comprises a nucleotide sequence of nucleotides 1 to 
1485 of the portion of SEQ ID NO:2 encoding the signal peptide and precursor protease. In 
another embodiment of the present invention, the polynucleotide encoding an amino acid 



WO 2005/052146 PCT/US2004/039066 




sequence comprises the sequence of nucleotides 1 to 1412 of the polynucleotide encoding 
the precursor Cellulomonas protease (SEQ ID NO:3). In yet another embodiment, the 
polynucleotide encoding an amino acid sequence comprises the sequence of nucleotides 1 
to 587 of the portion of the polynucleotide encoding the mature Cellulomonas protease 
(SEQ ID NO:4). 

As will be understood by the skilled artisan, due to the degeneracy of the genetic 
code, a variety of polynucleotides can encode the signal peptide, precursor protease and/or 
mature protease provided in SEQ ID NOS:6, 7, and/or 8, respectively, or a protease having 
the % sequence identity described above. Another embodiment of the present invention 
encompasses a polynucleotide comprising a nucleotide sequence having at least 70% 
sequence identity, at least 75% sequence identity, at least 80% sequence identity, at least 
85% sequence identity, at least 90% sequence identity, at least 92% sequence identity, at 
least 95% sequence identity, at least 97% sequence identity, at least 98% sequence identity 
and at least 99% sequence identity to the polynucleotide sequence of SEQ ID NOS:2, 3, 
and/or 4, respectively, encoding the signal peptide and precursor protease, the precursor 
protease and/or the mature protease, respectively. 

In additional embodiments, the present invention provides fragments or portions of 
DNA that encodes proteases, so long as the encoded fragment retains proteolytic activity. 
Another embodiment of the present invention encompasses polynucleotides having at least 
20% of the sequence length, at least 30% of the sequence length, at least 40% of the 
sequence length, at least 50% of the sequence length, at least 60% of the sequence length, 
at least 70% of the sequence length, at least 75% of the sequence length, at least 80% of the 
sequence length, at least 85% of the sequence length, at least 90% of the sequence length, 
at least 92% of the sequence length, at least 95% of the sequence length, at least 97% of 
the sequence length, at least 98% of the sequence length and at least 99% of the sequence 
length of the polynucleotide sequence of SEQ ID NO:2, or residues 185-1672 of SEQ ID NO:1, 
encoding the precursor protease. In alternative embodiments, these fragments or portions 
of the sequence length are contiguous portions of the sequence length, useful for shuffling 
of the DNA sequence in recombinant DNA sequences (See e.g., U.S. Pat. No. 6,132,970). 

Another embodiment of the invention includes fragments of the DNA described 
herein that find use according to art recognized techniques in obtaining partial length DNA 
fragments capable of being used to isolate or identify polynucleotides encoding mature 
protease enzyme described herein from Cellulomonas 69B4, or a segment thereof having 
proteolytic activity. Moreover, the DNA provided in SEQ ID NO:1 finds use in identifying 
homologous fragments of DNA from other species, and particularly from Cellulomonas spp. 
which encode a protease or portion thereof having proteolytic activity. 

In addition, the present invention encompasses using primer or probe sequences 
constructed from SEQ ID NO:1, or a suitable portion or fragment thereof (e.g., at least about 
5-20 or 10-15 contiguous nucleotides), as a probe or primer for screening nucleic acid of 
either genomic or cDNA origin. In some embodiments, the present invention provides DNA 
probes of the desired length (i.e., generally between 100 and 1000 bases in length), based 
on the sequences in SEQ ID NOS:1, 2, 3, and/or 4. 

In some embodiments, the DNA fragments are electrophoretically isolated, cut from 
the gel, and recovered from the agar matrix of the gel. In preferred embodiments, this 
purified fragment of DNA is then labeled (using, for example, the Megaprime labeling 
system according to the instructions of the manufacturer) to incorporate 32P in the DNA. 
The labeled probe is denatured by heating to 95°C for a given period of time (e.g., 5 
minutes), and immediately added to the membrane and prehybridization solution. The 
hybridization reaction proceeds for an appropriate time and under appropriate conditions 
(e.g., 18 hours at 37°C), with gentle shaking or rotation. The membrane is rinsed (e.g., 
twice in SSC/0.3% SDS) and then washed in an appropriate wash solution with gentle 
agitation. The stringency desired is a reflection of the conditions under which the 
membrane (filter) is washed. In some embodiments herein, "low-stringency" conditions 
involve washing with a solution of 0.2X SSC/0.1% SDS at 20°C for 15 minutes, while in 
other embodiments, "medium-stringency" conditions involve a further washing step 
comprising washing with a solution of 0.2X SSC/0.1% SDS at 37°C for 30 minutes, while in 
other embodiments, "high-stringency" conditions involve a further washing step comprising 
washing with a solution of 0.2X SSC/0.1% SDS at 37°C for 45 minutes, and in further 
embodiments, "maximum-stringency" conditions involve a further washing step comprising 
washing with a solution of 0.2X SSC/0.1% SDS at 37°C for 60 minutes. Thus, various 
embodiments of the present invention provide polynucleotides capable of hybridizing to a 
probe derived from the nucleotide sequence provided in SEQ ID NOS:1, 2, 3, 4, and/or 5, 
under conditions of medium, high and/or maximum stringency. 

After washing, the membrane is dried and the bound probe detected. If 32P or 
another radioisotope is used as the labeling agent, the bound probe is detected by 
autoradiography. Other techniques for the visualization of other probes are well-known to 
those of skill in the art. The detection of a bound probe indicates a nucleic acid sequence 
has the desired homology, and therefore identity to SEQ ID NOS:1, 2, 3, 4, and/or 5, and is 
encompassed by the present invention. Accordingly, the present invention provides 
methods for the detection of nucleic acid encoding a protease encompassed by the present 
invention which comprises hybridizing part or all of a nucleic acid sequence of SEQ ID 
NOS:1, 2, 3, 4, and/or 5 with other nucleic acid of either genomic or cDNA origin. 

As indicated above, in other embodiments, hybridization conditions are based on the 
melting temperature (Tm) of the nucleic acid binding complex, to confer a defined 
"stringency" as explained below. "Maximum stringency" typically occurs at about Tm-5°C 
(5°C below the Tm of the probe); "high stringency" at about 5°C to 10°C below Tm; 
"intermediate stringency" at about 10°C to 20°C below Tm; and "low stringency" at about 
20°C to 25°C below Tm. As known to those of skill in the art, medium, high and/or maximum 
stringency hybridization are chosen such that conditions are optimized to identify or detect 
polynucleotide sequence homologues or equivalent polynucleotide sequences. 
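
The Tm-based stringency bands above can be expressed as a small lookup. This is only a sketch under stated assumptions: the function name is hypothetical, the bands in the text are approximate ("about") and share their end points, so the assignment of exactly 5°C, 10°C and 20°C below Tm to the more stringent band is a choice made here for illustration.

```python
def stringency_from_tm(tm_c: float, temperature_c: float) -> str:
    """Name the stringency band for a temperature relative to the probe Tm.

    Bands follow the text: maximum at about Tm-5, high at about 5-10 below Tm,
    intermediate at about 10-20 below Tm, low at about 20-25 below Tm.
    Boundary handling is an assumption.
    """
    degrees_below = tm_c - temperature_c
    if degrees_below < 0 or degrees_below > 25:
        return "outside the defined bands"
    if degrees_below <= 5:
        return "maximum"
    if degrees_below <= 10:
        return "high"
    if degrees_below <= 20:
        return "intermediate"
    return "low"


# Hypothetical probe with Tm = 70 degrees C.
for temp in (65, 62, 55, 47, 40):
    print(temp, stringency_from_tm(70.0, temp))
```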

In yet additional embodiments, the present invention provides nucleic acid constructs 
(i.e., expression vectors) comprising the polynucleotides encoding the proteases of the 
present invention. In further embodiments, the present invention provides host cells 
transformed with at least one of these vectors. 

In further embodiments, the present invention provides polynucleotide sequences 
further encoding a signal sequence. In some embodiments, the invention encompasses 
polynucleotides having signal activity comprising a nucleotide sequence having at least 65% 
sequence identity, at least 70% sequence identity, preferably at least 75% sequence 
identity, more preferably at least 80% sequence identity, still further preferably at least 85% 
sequence identity, even more preferably at least 90% sequence identity, more preferably at 
least 95% sequence identity, more preferably at least 97% sequence identity, at least 98% 
sequence identity, and most preferably at least 99% sequence identity to SEQ ID NO:5. 
Thus, in these embodiments, the present invention provides a sequence with a putative 
signal sequence, and polynucleotides being capable of hybridizing to a probe derived from 
the nucleotide sequence disclosed in SEQ ID NO:5 under conditions of medium, high and/or 
maximal stringency, wherein the signal sequences have substantially the same signal 
activity as the signal sequence encoded by the polynucleotide of the present invention. 

In some embodiments, the signal activity is indicated by substantially the same level 
of secretion of the protease into the fermentation medium, as the starting material. For 
example, in some embodiments, the present invention provides fermentation medium 
protease levels at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 
95%, or at least 98% of the secreted protease levels in the fermentation medium as 
provided by the signal sequence of SEQ ID NO:3. In some embodiments, the secreted 
protease levels are ascertained by protease activity analyses such as the pNA assay (See 
[...] heterologous or homologous protein in a Gram-positive host cell and detecting secreted 
proteins include using either polyclonal or monoclonal antibodies specific for the 
protein. Examples include enzyme-linked immunosorbent assay (ELISA), radioimmunoassay (RIA) 
and fluorescent activated cell sorting (FACS), as well-known to those in the art. 

In further embodiments, the present invention provides polynucleotides, encoding an 
amino acid [...] (nucleotides [...] of SEQ ID NO:[...]) [...] in 
SEQ ID NO:9, nucleotide residue positions 1 to 85 of SEQ ID NO:2 and/or SEQ ID NO:5. 
The invention further encompasses nucleic acid sequences which hybridize to the nucleic 
acid sequence shown in SEQ ID NO:5 under low, medium, high stringency and/or maximum 
stringency conditions, but which have substantially the same signal activity as the sequence. 
The present invention encompasses all such polynucleotides. 

In further embodiments, the present invention provides polynucleotides that are 
complementary to the nucleotide sequences described herein. Exemplary complementary 
nucleotide sequences include those that are provided in SEQ ID NOS:1-5. 

Further aspects of the present invention encompass polypeptides having proteolytic 
activity [...] at least 75% amino acid sequence identity, at least 80% amino acid sequence 
identity, at least [...] amino acid sequence identity, at least 95% amino acid sequence 
identity, at least 9[...]% amino acid sequence identity to the amino acid sequence of SEQ ID 
NO:6 (i.e., [...]) [...] SEQ ID NO:8 (i.e., the mature protease). The proteolytic activity of 
these polypeptides is determined using methods known in the art and include such methods 
as those used to assess detergent function. In further embodiments, the polypeptides are 
isolated. In additional embodiments, [...] that are identical to an amino acid sequence 
selected from the group consisting of the amino acid sequences of SEQ ID NOS:6, 7, or 8. 
In some further embodiments, the polypeptides are identical to portions of SEQ ID NOS:6, 7 
or 8. 

In some embodiments, the present invention provides isolated polypeptides having 
proteolytic activity, comprising the amino acid sequence approximately 495 amino acids in 
length, as provided in SEQ ID NO:6. In further embodiments, the present invention 
encompasses polypeptides having proteolytic activity comprising the amino acid sequence 
approximately 467 amino acids in length provided in SEQ ID NO:7. In some embodiments, 
these amino acid sequences comprise a signal sequence (amino acids 1-28 of SEQ ID 
NO:9); and a precursor protease (amino acids 1-467 of SEQ ID NO:7). In additional 
embodiments, the present invention encompasses polypeptides comprising an N-terminal 
prosequence (amino acids 1-170 of SEQ ID NO:7), a mature protease sequence (amino 
acids 1-189 of SEQ ID NO:8), and a C-terminal prosequence (amino acids 360-467 of SEQ 
ID NO:7). In still further embodiments, the present invention encompasses polypeptides 
comprising a precursor protease sequence (e.g., amino acids 1-467 of SEQ ID NO:7). In 
yet another embodiment, the present invention encompasses polypeptides comprising a 
mature protease sequence comprising amino acids (e.g., 1-189 of SEQ ID NO:8). 

In further embodiments, the present invention provides polypeptides and/or 
proteases comprising amino acid sequences of the above described sequence derived from 
bacterial species including, but not limited to Micrococcineae which are identified through 
amino acid sequence homology studies. In some embodiments, an amino acid residue of a 
precursor Micrococcineae protease is equivalent to a residue of Cellulomonas strain 69B4, if 
it is either homologous (i.e., corresponding in position in either primary or tertiary structure) 
or analogous to a specific residue or portion of that residue in Cellulomonas strain 69B4 
protease (i.e., having the same or similar functional capacity to combine, react, or interact 
chemically). 

In some preferred embodiments, in order to establish homology to primary structure, 
the amino acid sequence of a precursor protease is directly compared to the Cellulomonas 
strain 69B4 mature protease amino acid sequence and particularly to a set of conserved 
residues which are discerned to be invariant in all or a large majority of Cellulomonas-like 
proteases for which sequence is known. After aligning the conserved residues, allowing for 
necessary insertions and deletions in order to maintain alignment (i.e., avoiding the 
elimination of conserved residues through arbitrary deletion and insertion), the residues 
corresponding to particular amino acids in the mature protease (SEQ ID NO:8) and 
Cellulomonas 69B4 protease are determined. Alignment of conserved residues preferably 
should conserve 100% of such residues. However, alignment of greater than 75% or as 
little as 45% of conserved residues is also adequate to define equivalent residues. 
However, conservation of the catalytic triad, His32/Asp56/Ser137 of SEQ ID NO:8 should be 
maintained. 
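
The notion of equivalent residues described above can be illustrated by mapping positions of a reference sequence onto an aligned homologue and then checking how many conserved residues, and whether the catalytic triad, are retained. This is a sketch only: the function names are hypothetical, the sequences are toy strings rather than SEQ ID NO:8, and the toy "triad" positions stand in for His32/Asp56/Ser137.

```python
def equivalent_residues(aligned_ref: str, aligned_other: str, gap: str = "-") -> dict:
    """Map 1-based reference positions to (position, residue) in the other sequence.

    A reference position aligned to a gap maps to None.
    """
    if len(aligned_ref) != len(aligned_other):
        raise ValueError("aligned sequences must have equal length")
    mapping = {}
    ref_pos = other_pos = 0
    for r, o in zip(aligned_ref, aligned_other):
        if o != gap:
            other_pos += 1
        if r != gap:
            ref_pos += 1
            mapping[ref_pos] = (other_pos, o) if o != gap else None
    return mapping


def fraction_retained(mapping: dict, reference_residues: dict) -> float:
    """Fraction of the given reference residues found unchanged in the homologue."""
    kept = sum(
        1
        for pos, res in reference_residues.items()
        if mapping.get(pos) is not None and mapping[pos][1] == res
    )
    return kept / len(reference_residues)


# Toy alignment (hypothetical): one insertion and one deletion in the homologue.
mapping = equivalent_residues("AHG-DKS", "AHGTD-S")
conserved = {1: "A", 2: "H", 4: "D", 5: "K", 6: "S"}  # assumed conserved set
toy_triad = {2: "H", 4: "D", 6: "S"}  # stands in for His32/Asp56/Ser137

print(mapping[4])                             # equivalent of reference residue 4
print(fraction_retained(mapping, conserved))  # compare with the 45%/75%/100% levels
print(fraction_retained(mapping, toy_triad) == 1.0)  # triad fully retained?
```

For a real comparison the dictionary of triad positions would be `{32: "H", 56: "D", 137: "S"}` against an alignment with the mature protease of SEQ ID NO:8.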

For example, in some embodiments, the amino acid sequence of proteases from 
Cellulomonas strain 69B4, and other Micrococcineae spp. described above are aligned to 
provide the maximum amount of homology between amino acid sequences. A comparison 
of these sequences indicates that there are a number of conserved residues contained in 
each sequence. These are the residues that are identified and utilized to establish the 
equivalent residue positions of amino acids identified in the precursor or mature 
Micrococcineae protease in question. 

These conserved residues are used to ascertain the corresponding amino acid 
residues of Cellulomonas strain 69B4 protease in one or more Micrococcineae 
homologues (e.g., Cellulomonas cellasea (DSM 20118) and/or a Cellulomonas homologue 
herein). These particular amino acid sequences are aligned with the sequence of 
Cellulomonas 69B4 protease to produce the maximum homology of conserved residues. By 
this alignment, the sequences and particular residue positions of Cellulomonas 69B4 are 
[...] the equivalent amino acid for the catalytic triad (e.g., in Cellulomonas 69B4 protease) 
is identifiable in the other Micrococcineae spp. In some embodiments of the present 
invention, the protease homologs comprise the equivalent of His32/Asp56/Ser137 of SEQ ID 
NO:8. 

Another indication that two polypeptides are substantially identical is that the first 
polypeptide is immunologically cross-reactive with the second polypeptide. Methodologies 
for determining immunological cross-reactivity are described in the art and are described in 
the Examples herein. Typically, polypeptides that differ by conservative amino acid 
substitutions are immunologically cross-reactive. Thus, a polypeptide is substantially 
identical to a second polypeptide, for example, where the two peptides differ only by a 
conservative substitution. 

The present invention encompasses proteases obtained from various sources. In 
some preferred embodiments, the proteases are obtained from bacteria, while in other 
embodiments, the proteases are obtained from fungi. 

In some particularly preferred embodiments, the bacterial source is selected from the 
members of the suborder Micrococcineae. In some embodiments, the bacterial source is 
the family Promicromonosporaceae. In some preferred embodiments the 
Promicromonosporaceae spp. includes and/or is selected from the group consisting of 
Promicromonospora citrea (DSM 43110), Promicromonospora sukumoe (DSM 44121), 
Promicromonospora aerolata (CCM 7043), Promicromonospora vindobonensis (CCM 7044), 
Myceligenerans xiligouense (DSM 15700), Isoptericola variabilis (DSM 10177, basonym 
Cellulosimicrobium variabile), Cellulosimicrobium cellulans (DSM 20424, basonym Nocardia 
cellulans [...]), [...], Xylanibacterium ulmi (LMG 21721), and Xylanimicrobium pachnodae (DSM 
12657, basonym Promicromonospora pachnodae). 

In other particularly preferred embodiments, the bacterial source is the family 
Cellulomonadaceae. In some preferred embodiments, the Cellulomonadaceae spp. includes 
and/or is selected from the group of Cellulomonas fimi (ATCC 484, DSM 20113), 
Cellulomonas biazotea (ATCC 486, DSM 20112), Cellulomonas cellasea (ATCC 487, 
21681, DSM 20118), Cellulomonas denverensis, Cellulomonas hominis (DSM 9581), 
Cellulomonas flavigena (ATCC 482, DSM 20109), Cellulomonas persica (ATCC 700642, 
DSM 14784), Cellulomonas iranensis (ATCC 700643, DSM 14785), Cellulomonas 
fermentans (ATCC 43279, DSM 3133), Cellulomonas gelida (ATCC 488, DSM 20111, DSM 
20110), Cellulomonas humilata (ATCC 25174, basonym Actinomyces humiferus), 
Cellulomonas uda (ATCC 491, DSM 20107), Cellulomonas xylanilytica (LMG 21723), 
Cellulomonas septica, Cellulomonas parahominis, Oerskovia turbata (ATCC 25835, DSM 
20577, synonym Cellulomonas turbata), Oerskovia jenensis (DSM 46000), Oerskovia 
enterophila (ATCC 35307, DSM 43852, basonym Promicromonospora enterophila), 
Oerskovia paurometabola (DSM 14281), and Cellulomonas strain 69B4 (DSM 16035). In 
further embodiments, the bacterial source also includes and/or is selected from the group of 
Thermobifida spp., Rarobacter spp., and/or Lysobacter spp. In yet additional embodiments, 
the Thermobifida spp. is Thermobifida fusca (basonym Thermomonospora fusca) (tfpA, 
AAC23545; See, Lao et al., Appl. Environ. Microbiol., 62:4256-4259 [1996]). In an 
alternative embodiment, the Rarobacter spp. is Rarobacter faecitabidus (RPI, A45053; See 
e.g., Shimoi et al., J. Biol. Chem., 267:25189-25195 [1992]). In yet another embodiment, 
the Lysobacter spp. is Lysobacter enzymogenes. 

In further embodiments, the present invention provides polypeptides and/or 
polynucleotides obtained and/or isolated from fungal sources. In some embodiments, the 
fungal source includes a Metarhizium spp. In some preferred embodiments, the fungal 
source is a Metarhizium anisopliae (CHY1, CAB60729). 

In another embodiment, the present invention provides polypeptides and/or 
polynucleotides derived from a Cellulomonas strain selected from cluster 2 of the taxonomic 
classification described in U.S. Pat. No. 5,401,657, herein incorporated by reference. In U.S. 
Pat. No. 5,401,657, twenty strains of bacteria isolated from in and around alkaline lakes were 
assigned to the type of bacteria known as Gram-positive bacteria on the basis of: (1) the 
Dussault modification of the Gram's staining reaction (Dussault, J. Bacteriol., 70:484-485 
[1955]); (2) the KOH sensitivity test (Gregersen, Eur. J. Appl. Microbiol. Biotechnol., 5:123-127 
[1978]; Halebian et al., J. Clin. Microbiol., 13:444-448 [1981]); and (3) the 
aminopeptidase reaction (Cerny, Eur. J. Appl. Microbiol., 3:223-225 [1976]; Cerny, Eur. J. 
Appl. Microbiol., 5:113-122 [1978]). In addition, in most cases, confirmation was also made 
on the basis of quinone analysis (Collins and Jones, Microbiol. Rev., 45:316-354 [1981]) 
using the method described by Collins (See, Collins, In Goodfellow and Minnikin (eds), 
Chemical Methods in Bacterial Systematics, Academic Press, London [1985], pp. 267-288). 
In addition, strains can be tested for 200 characters and the results analyzed using the 
principles of numerical taxonomy (See e.g., Sneath and Sokal, Numerical Taxonomy, W.H. 
Freeman & Co., San Francisco, CA [1973]). Exemplary characters tested, testing 
methods, and codification methods are also described in U.S. Pat. No. 5,401,657. 

As described in U.S. Pat. No. 5,401,657, the phenetic data, consisting of 200 unit 
characters, were scored and set out in the form of an "n x t" matrix, whose t columns 
represent the t bacterial strains to be grouped on the basis of resemblances, and whose 
n rows are the unit characters. Taxonomic resemblance of the bacterial strains was 
estimated by means of a similarity coefficient (Sneath and Sokal, supra, pp. 114-187). 
Although many different coefficients have been used for biological classification, only a few 
have found regular use in bacteriology. Three association coefficients (See e.g., Sneath 
and Sokal, supra, at p. 129), namely, the Gower, Jaccard and Simple Matching coefficients 
were applied. These have been frequently applied to the analysis of bacteriological data and 
are widely accepted by those skilled in the art, as they have been shown to result in robust 
classifications. 
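
The three association coefficients named above can be illustrated with a minimal Python sketch. It is not taken from U.S. Pat. No. 5,401,657 or from TAXPAK; the function names, the character vectors, and the use of `None` to flag a two-state or qualitative character are assumptions made only for illustration. Two-state characters are coded 1 (positive) or 0 (negative), and the Gower form shown permits negative matches.

```python
from typing import Optional, Sequence


def simple_matching(a: Sequence[int], b: Sequence[int]) -> float:
    """Simple Matching: (positive + negative matches) / all characters."""
    if len(a) != len(b) or not a:
        raise ValueError("strains must share a non-empty character set")
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


def jaccard(a: Sequence[int], b: Sequence[int]) -> float:
    """Jaccard: positive matches / characters positive in either strain.

    Negative matches (0/0) are ignored entirely.
    """
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return both / either if either else 0.0


def gower(a: Sequence[float], b: Sequence[float],
          ranges: Sequence[Optional[float]]) -> float:
    """Gower coefficient with negative matches permitted.

    ranges[k] is None for a two-state or qualitative character (score 1 on
    a match, 0 otherwise) and the observed range of the character across
    all strains for a quantitative one (score 1 - |difference| / range).
    """
    total = 0.0
    for x, y, r in zip(a, b, ranges):
        if r is None:
            total += 1.0 if x == y else 0.0
        else:
            total += 1.0 - abs(x - y) / r
    return total / len(a)


# Hypothetical strains scored on six two-state characters.
strain_a = [1, 1, 0, 0, 1, 0]
strain_b = [1, 0, 0, 0, 1, 1]
print(simple_matching(strain_a, strain_b))  # 4 matches out of 6 characters
print(jaccard(strain_a, strain_b))          # 2 shared positives out of 4
```

For purely two-state data the Gower coefficient with negative matches permitted reduces to the Simple Matching coefficient, which is why the Jaccard coefficient serves as the independent check on clusters driven by shared negative results.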

The coded data were analyzed using the TAXPAK program package (Sackin, Meth. 
Microbiol., 19:459-494 [1987]), run on a DEC VAX computer at the University of Leicester, 
U.K. 

A similarity matrix was constructed for all pairs of strains using the Gower Coefficient 
(SG) with the option of permitting negative matches (See, Sneath and Sokal, supra, at pp. 
135-136), using the RTBNSIM program in TAXPAK. As the primary instrument of analysis 
and the one upon which most of the taxonomic data presented herein are based, the Gower 
Coefficient was chosen over other coefficients for generating similarity matrices because it 
is applicable to all types of characters or data, namely, two-state, multistate (ordered and 
qualitative), and quantitative. 

Cluster analysis of the similarity matrix was accomplished using the Unweighted Pair 
Group Method with Arithmetic Averages (UPGMA) algorithm, also known as the Unweighted 
Average Linkage procedure, by running the SMATCLST sub-routine in TAXPAK. 
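
The UPGMA procedure can be sketched in a few lines of Python. This is an illustrative re-implementation and not the SMATCLST sub-routine; the function names, the four-strain similarity matrix, and the `phenons_at` helper for cutting the dendrogram at a chosen similarity level are all hypothetical.

```python
from typing import Dict, FrozenSet, List, Sequence, Tuple

Merge = Tuple[Tuple[str, ...], Tuple[str, ...], float]


def upgma(sim: Sequence[Sequence[float]], labels: Sequence[str]) -> List[Merge]:
    """Cluster by unweighted average linkage on a similarity matrix.

    Returns the merges in order as (members_left, members_right, similarity).
    """
    n = len(labels)
    members: Dict[int, Tuple[str, ...]] = {i: (labels[i],) for i in range(n)}
    pair_sim: Dict[FrozenSet[int], float] = {
        frozenset((i, j)): sim[i][j] for i in range(n) for j in range(i + 1, n)
    }
    merges: List[Merge] = []
    next_id = n
    while len(members) > 1:
        best = max(pair_sim, key=pair_sim.get)  # most similar pair of clusters
        i, j = sorted(best)
        merges.append((members[i], members[j], pair_sim[best]))
        size_i, size_j = len(members[i]), len(members[j])
        # Similarity of the merged cluster to every other cluster is the
        # size-weighted mean, i.e. the arithmetic average over all strain pairs.
        for k in members:
            if k in (i, j):
                continue
            pair_sim[frozenset((next_id, k))] = (
                size_i * pair_sim[frozenset((i, k))]
                + size_j * pair_sim[frozenset((j, k))]
            ) / (size_i + size_j)
        for key in [key for key in pair_sim if i in key or j in key]:
            del pair_sim[key]
        members[next_id] = members.pop(i) + members.pop(j)
        next_id += 1
    return merges


def phenons_at(labels: Sequence[str], merges: List[Merge], level: float) -> List[List[str]]:
    """Groups obtained by keeping only merges at or above a similarity level."""
    groups = {lab: {lab} for lab in labels}
    for left, right, s in merges:
        if s >= level:
            merged = groups[left[0]] | groups[right[0]]
            for lab in merged:
                groups[lab] = merged
    return sorted(sorted(g) for g in {frozenset(g) for g in groups.values()})


# Hypothetical similarity matrix for four strains.
labels = ["A", "B", "C", "D"]
sim = [
    [1.0, 0.9, 0.5, 0.4],
    [0.9, 1.0, 0.6, 0.3],
    [0.5, 0.6, 1.0, 0.8],
    [0.4, 0.3, 0.8, 1.0],
]
merges = upgma(sim, labels)
print(phenons_at(labels, merges, 0.79))
```

Cutting this toy dendrogram at a similarity of 0.79 keeps the A/B and C/D pairs as separate groups, in the same way that a delineation level defines phenons in a full dendrogram.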

Dendrograms illustrate the levels of similarity between bacterial strains. In some 
embodiments, dendrograms are obtained by using the DENDGR program in TAXPAK. The 
phenetic data were re-analyzed using the Jaccard Coefficient (SJ) (Sneath and Sokal, supra, 
at p. 131) and Simple Matching Coefficient (SSM) (Sneath and Sokal, supra, at p. 
132) by running the RTBNSIM program in TAXPAK. An additional two dendrograms were 
obtained by using the SMATCLST with UPGMA option and DENDGR sub-routines in 
TAXPAK. 

Using the SG/UPGMA method, six natural clusters or phenons of alkalophilic 
bacteria were generated at the 79% similarity level. These six clusters included 15 of the 20 
alkalophilic bacteria isolated from alkaline lakes. Although the choice of 79% for the level of 
delineation was arbitrary, it was in keeping with current practices in numerical taxonomy 
(See e.g., Austin and Priest, Modern Bacterial Taxonomy, Van Nostrand Reinhold, Wokingham, 
U.K., [1986], p. 37). Placing the delineation at a lower percentage would combine groups of 
clearly unrelated organisms whose definition is not supported by the data. At the 79% level, 
3 of the clusters exclusively contain novel alkalophilic bacteria representing 13 of the newly 
isolated strains (potentially representing new taxa). Protease 69B4 was classified as in 
cluster 2 by this method. 

The significance of the clustering at this level was supported by the results of the 
TESTDEN program. This program tests the significance of all dichotomous pairs of clusters 
(comprising 4 or more strains) in a UPGMA-generated dendrogram with Squared Euclidean 
distances, or their complement, as a measurement and assuming that the clusters are 
hyperspherical. The critical overlap was set at 0.25%. The separation of the clusters is 
highly significant. 

The SJ coefficient is a useful adjunct to the SG coefficient, as it can be used to detect 
phenons in the latter that are based on negative matches or distortions owing to undue 
weight being put on potentially subjective qualitative data. Consequently, the SJ coefficient 
is useful for confirming the validity of clusters defined initially by the use of the SG 
coefficient. The Jaccard Coefficient is particularly useful in comparing biochemically 
unreactive organisms (Austin and Priest, supra, at p. 37). In addition, there may be some 
question about the admissibility of matching negative character states (See, Sneath and 
Sokal, supra, at p. 131), in which case the Simple Matching Coefficient is a widely applied 
alternative. Strain 69B4 was classified as in cluster 2 by this method. 

In the main, all of the clusters (especially the clusters of the new bacteria) generated 
by the SG/UPGMA method were recovered in the dendrograms produced by the SJ/UPGMA 
method (cophenetic correlation, 0.795), and the SSM/UPGMA method (cophenetic 
correlation, 0.814). The main effect of these transformations was to gather all the Bacillus 
strains in a single large cluster which further serves to emphasize the separation between 
the alkalophilic Bacillus species and the new alkalophilic bacteria, and the uniqueness of the 
latter. Based on these methodologies, 69B4 is considered to be a cluster 2 bacterium. 

In other aspects of the present invention, the polynucleotide is derived from a 
bacterium having a 16S rRNA gene nucleotide sequence with at least 70%, 75%, 80%, 85%, 88%, 
90%, 92%, 95%, or 98% sequence identity with the 16S rRNA gene nucleotide sequence of 
Cellulomonas strain 69B4. The sequence of the 16S rRNA gene is deposited at GenBank 
under Accession Number X92152. 
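
To make the identity thresholds concrete, the following Python sketch computes percent identity between two pre-aligned sequences and reports the highest recited threshold that is met. The text here does not specify how identity is to be calculated, so skipping gapped columns is an assumption, and the function names and short test sequences are hypothetical (they are not the X92152 sequence).

```python
from typing import Optional

# Identity levels recited in the paragraph above.
THRESHOLDS = (70, 75, 80, 85, 88, 90, 92, 95, 98)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical positions among ungapped alignment columns.

    Assumption: columns containing a gap ('-') in either sequence are
    skipped; other conventions count gaps as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def highest_threshold_met(identity: float) -> Optional[int]:
    """Largest recited threshold that the identity reaches, or None."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None


# Hypothetical ten-column alignment with a single mismatch.
identity = percent_identity("ACGTACGTAC", "ACGTTCGTAC")
print(identity, highest_threshold_met(identity))
```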

Figure 1 provides an unrooted phylogenetic tree illustrating the relationship of novel 
strain 69B4 to members of the family Cellulomonadaceae (including Cellulomonas strain 
69B4) and other related genera of the suborder Micrococcineae. The dendrogram was 
constructed from aligned 16S rDNA sequences (1374 nt) using TREECONW v. 1.3b (Van de 
Peer and De Wachter, Comput. Appl. Biosci., 10: 569-570 [1994]). Distance estimations 
were calculated using the substitution rate calibration of Jukes and Cantor (Jukes and 
Cantor, "Evolution of protein molecules," In, Munro (ed.), Mammalian Protein Metabolism, 
Academic Press, NY, at pp. 21-132, [1969]) and tree topology inferred by the Neighbor-Joining 
algorithm (Saitou and Nei, Mol. Biol. Evol., 4:406-425 [1987]). The numbers at the 
nodes refer to bootstrap values from 100 resampled data sets (Felsenstein, Evol., 39:783-789 
[1985]) and the scale bar indicates 2 nucleotide substitutions in 100 nt. 
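
The Jukes and Cantor correction used for the distance estimates can be sketched as follows. This is a minimal illustration of the standard formula d = -(3/4) ln(1 - 4p/3), where p is the observed proportion of differing sites; it is not the TREECONW implementation, and the function names and example sequences are hypothetical.

```python
import math


def p_distance(aligned_a: str, aligned_b: str) -> float:
    """Observed proportion of differing sites among ungapped columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
             if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(1 for x, y in pairs if x != y) / len(pairs)


def jukes_cantor(p: float) -> float:
    """Estimated substitutions per site under the Jukes-Cantor model."""
    if not 0.0 <= p < 0.75:
        raise ValueError("correction is undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Hypothetical ten-column alignment with one differing site (p = 0.1).
p = p_distance("ACGTACGTAC", "ACGTTCGTAC")
d = jukes_cantor(p)
print(p, d)
```

The corrected distance always equals or exceeds the observed proportion because it allows for multiple substitutions at the same site; a scale bar of 2 substitutions in 100 nt corresponds to d = 0.02 on this scale.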

The strain 69B4 exhibits the closest 16S rDNA relationship to members of 
Cellulomonas and Oerskovia of the family Cellulomonadaceae. The closest relatives are 
believed to be C. cellasea (DSM 20118) and C. fimi (DSM 20113), each having at least 95% 
sequence identity (e.g., 96% and 95% identity, respectively) with the 16S rRNA gene 
nucleotide sequence of Cellulomonas strain 69B4. 

In some preferred embodiments of the present invention, the Cellulomonas spp. is 
Cellulomonas strain 69B4 (DSM16035). This strain was originally isolated from a sample of 
sediment and water from the littoral zone of Lake Bogoria, Kenya at Acacia Camp (Lat. 0° 
12'N, Long. 36° 07'E) collected on 10 October 1988. The water temperature was 33°C, pH 
10.5 with a conductivity of 44 mS/cm. Cellulomonas strain 69B4 was determined to have 
the phenotypic characteristics described below. Fresh cultures were Gram-positive, slender, 
generally straight, rod-shaped bacteria, approximately 0.5-0.7 µm x 1.8-4 µm. Older cultures 
contained mainly short rods and coccoid cells. Cells occasionally occurred in pairs or as V- 
forms, but primary branching was not observed. Endospores were not detected. On 
alkaline GAM agar the strain forms opaque, glistening, pale-yellow coloured, circular and 
convex or domed colonies, with entire margins, about 2 mm in diameter after 2-3 days 
incubation at 37°C. The colonies were viscous or slimy with a tendency to clump when 
scraped with a loop. On neutral Tryptone Soya Agar, strain growth was less vigorous, 
giving translucent yellow colonies, generally <1 mm in diameter. The cultures were 
facultatively anaerobic, as they were capable of growth under strictly anaerobic conditions. 
However, growth under anaerobic conditions was markedly reduced compared to aerobic 
growth. The strain also appeared to be negative in standard oxidase, urease, 
aminopeptidase, and KOH tests. In addition, nitrate was not reduced, although the 
organisms were catalase positive and DNase was produced under alkaline conditions. The 
preferred temperature range for growth was 20 - 37°C, with an optimum temperature at 
around 30-37°C. No growth was observed at 15°C or 45°C. 

The strain is alkalophilic and slightly halophilic. The strain may also be characterized 
as having growth occurring at pH values between 6.0 and 10.5 with an optimum around pH 
9-10. No growth was observed at pH 11 or pH 5.5. Growth below pH 7 was less vigorous 
and abundant than that of cultures grown at the optimal temperature. The strain was 
observed to grow in medium containing 0-8% (w/v) NaCl. Furthermore, the strain may also 
be characterized as a chemo-organotroph, since it grew on complex substrates such as 
yeast extract and peptone; and hydrolyzed starch, gelatin, casein, carboxymethylcellulose 
and amorphous cellulose. 

The strain was observed to have metabolism that was respiratory and also 
fermentative. Acid was produced both aerobically and anaerobically from (API 50CH): L-arabinose, 
D-xylose, D-glucose, D-fructose, D-mannose, rhamnose (weak), cellobiose, 
maltose, sucrose, trehalose, gentiobiose, D-turanose, D-lyxose and 5-keto-gluconate 
(weak). Amygdalin, arbutin, salicin and esculin are also utilized. The strain was unable to 
utilize: ribose, lactose, galactose, melibiose, D-raffinose, glycogen, glycerol, erythritol, 
inositol, mannitol, sorbitol, xylitol, arabitol, gluconate and lactate. 

The strain was determined to be susceptible to ampicillin, chloramphenicol, 
erythromycin, fusidic acid, methicillin, novobiocin, streptomycin, tetracycline, sulphafurazole, 
oleandomycin, polymyxin, rifampicin, vancomycin and bacitracin; but resistant to gentamicin, 
nitrofurantoin, nalidixic acid, sulphamethoxazole, trimethoprim, penicillin G, neomycin and 
kanamycin. 

The following enzymes, aside from the protease of the present invention, were 
observed to be produced (ApiZym, API Coryne): C4-esterase, C8-esterase/lipase, leucine 
arylamidase, alpha-chymotrypsin, alpha-glucosidase, beta-glucosidase and pyrazinamidase. 

The strain was observed to exhibit the following chemotaxonomic characteristics. 
Major fatty acids (>10% of total) were C16:1 (28.1%), C18:0 (31.1%), C18:1 (13.9%). N-saturated 
(79.1%), n-unsaturated (19.9%). Fatty acids with even numbers of carbons 
accounted for 98%. Main polar lipid components: phosphatidylglycerol (PG) and 3 
unidentified glycolipids (alpha-naphthol positive) were present; DPG, PGP, PI and PE were 
not detected. Menaquinones MK-4, MK-6, MK-7 and MK-9 were the main isoprenoids 
present. The cell wall peptidoglycan type was A4β with L-ornithine as diamino acid and D-aspartic 
acid in the interpeptide bridge. With regard to toxicity evaluation, there are no 
known toxicity or pathogenicity issues associated with bacteria of the genus Cellulomonas. 

Although there may be variations in the sequence of a naturally occurring enzyme 
within a given species of organism, enzymes of a specific type produced by organisms of 
the same species generally are substantially identical with respect to substrate specificity 
and/or proteolytic activity levels under given conditions (e.g., temperature, pH, water 
hardness, oxidative conditions, chelating conditions, and concentration), etc. Thus, for the 
purposes of the present invention, it is contemplated that other strains and species of 
Cellulomonas also produce the Cellulomonas protease of the present invention and thus 
provide useful sources for the proteases of the present invention. Indeed, as presented 
herein, it is contemplated that other members of the Micrococcineae will find use in the 
present invention. 

In some embodiments, the proteolytic polypeptides of this invention are 
characterized physicochemically, while in other embodiments, they are characterized based 
on their functionality, while in further embodiments, they are characterized using both sets of 
properties. Physicochemical characterization takes advantage of well known techniques 
such as SDS electrophoresis, gel filtration, amino acid composition, mass spectrometry 
(e.g., MALDI-TOF-MS, LC-ES-MS/MS, etc.), and sedimentation to determine the molecular 
weight of proteins, isoelectric focusing to determine the pI of proteins, amino acid 
sequencing to determine the amino acid sequences of protein, crystallography studies to 
determine the tertiary structures of proteins, and antibody binding to determine antigenic 
epitopes present in proteins. 

In some embodiments, functional characteristics are determined by techniques well 
known to the practitioner in the protease field and include, but are not limited to, hydrolysis 
of various commercial substrates, such as di-methyl casein ("DMC") and/or AAPF-pNA. 
This preferred technique for functional characterization is described in greater detail in the 
Examples provided herein. 

In some embodiments of the present invention, the protease has a molecular weight 
of about 17 kD to about 21 kD, for example about 18 kD to 19 kD, for example 18700 daltons 
to 18800 daltons, for example about 18764 daltons (as determined by MALDI-TOF-MS). In 
another aspect of the present invention, the protease has a measured MALDI-TOF-MS spectrum 
as set forth in Figure 3. 

The mature protease also displays proteolytic activity (e.g., hydrolytic activity on a 
substrate having peptide linkages) such as DMC. In further embodiments, proteases of the 
present invention provide enhanced wash performance under identified conditions. 
Although the present invention encompasses the protease 69B4 as described herein, in 
some embodiments, the proteases of the present invention exhibit at least 50%, 60%, 70%, 
75%, 80%, 85%, 90%, 92%, 95%, 96%, 97%, 98% or 99% proteolytic activity as compared 
to the proteolytic activity of 69B4. In some embodiments, the proteases display at least 
50%, 60%, 70%, 75%, 80%, 85%, 90%, 92%, 95%, 96%, 97%, 98% or 99% proteolytic 
activity as compared to the proteolytic activity of proteases sold under the tradenames 
SAVINASE® (Novozymes) or PURAFECT® (Genencor) under the same conditions. In some 
embodiments, the proteases of the present invention display comparative or enhanced wash 
performance under identified conditions as compared to 69B4 under the same conditions. 
In some preferred embodiments, the proteases of the present invention display comparative 
or enhanced wash performance under identified conditions, as compared to proteases sold 
under the tradenames SAVINASE® (Novozymes) or PURAFECT® (Genencor) under the 
same conditions. 

In yet further embodiments, the proteases and/or polynucleotides encoding the 
proteases of the present invention are provided in purified form (i.e., present in a particular 
composition in a higher or lower concentration than exists in a naturally occurring or wild-type 
organism), or in combination with components not normally present upon expression 
from a naturally occurring or wild-type organism. However, it is not intended that the 
present invention be limited to proteases of any specific purity level, as ranges of protease 
purity find use in various applications in which the proteases of the present invention are 
suitable. 

III. Obtaining Polynucleotides Encoding Micrococcineae 
(e.g., Cellulomonas) Proteases of the Present Invention 

In some embodiments, nucleic acid encoding a protease of the present invention is 
obtained by standard procedures known in the art from, for example, cloned DNA (e.g., a 
DNA "library"), chemical synthesis, cDNA cloning, PCR, cloning of genomic DNA or 
fragments thereof, or purified from a desired cell, such as a bacterial or fungal species (See, 
for example, Sambrook et al., supra [1989]; and Glover and Hames (eds.), DNA Cloning: A 
Practical Approach, Vols. 1 and 2, Second Edition). Synthesis of polynucleotide sequences 
is well known in the art (See e.g., Beaucage and Caruthers, Tetrahedron Lett., 22:1859-1862 
[1981]), including the use of automated synthesizers (See e.g., Needham-VanDevanter 
et al., Nucl. Acids Res., 12:6159-6168 [1984]). DNA sequences can also be 
custom made and ordered from a variety of commercial sources. As described in greater 
detail herein, in some embodiments, nucleic acid sequences derived from genomic DNA 
contain regulatory regions in addition to coding regions. 

In some embodiments involving the molecular cloning of the gene from genomic DNA, 
DNA fragments are generated, some of which comprise at least a portion of the desired gene. 
In some embodiments, the DNA is cleaved at specific sites using various restriction enzymes. 
In some alternative embodiments, DNAse is used in the presence of manganese to fragment 
the DNA, or the DNA is physically sheared (e.g., by sonication). The linear DNA fragments 
created are then separated according to size and amplified by standard techniques, 
including but not limited to, agarose and polyacrylamide gel electrophoresis, PCR and column 
chromatography. 

Once nucleic acid fragments are generated, identification of the specific DNA 
fragment encoding a protease may be accomplished in a number of ways. For example, in 
some embodiments, the asp gene encoding a proteolytic enzyme, or its specific 
RNA, or a fragment thereof, such as a probe or primer, is isolated, labeled, and then used in 
hybridization assays well known to those in the art, to detect a generated gene (See e.g., 
Benton and Davis, Science 196:180 [1977]; and Grunstein and Hogness, Proc. Natl. Acad. 
Sci. USA 72:3961 [1975]). In preferred embodiments, DNA fragments sharing substantial 
sequence similarity to the probe hybridize under medium to high stringency. 

In some preferred embodiments, amplification is accomplished using PCR, as known 
in the art. In some preferred embodiments, a nucleic acid sequence of at least about 4 
nucleotides and as many as about 60 nucleotides from SEQ ID NOS:1, 2, 3 and/or 4 (i.e., 
fragments), preferably about 12 to 30 nucleotides, and more preferably about 25 nucleotides 
are used in any suitable combinations as PCR primers. These same fragments also find use 
as probes in hybridization and product detection methods. 

In some embodiments, isolation of nucleic acid constructs of the invention from a 
cDNA or genomic library utilizes PCR using degenerate oligonucleotide primers 
prepared on the basis of the amino acid sequence of the protein having the amino acid 
sequence as shown in SEQ ID NOS:1-5. The primers can be of any segment length, for 
example at least 4, at least 5, at least 8, at least 15, or at least 20 nucleotides in length. 
Exemplary probes in the present application utilized a primer comprising a TTGWHCGT and 
a GDSGG polynucleotide sequence, as more fully described in the Examples. 
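
The idea of preparing degenerate primers from an amino acid sequence can be illustrated with a short Python sketch. It treats TTGWHCGT and GDSGG as single-letter amino acid motifs, which is an assumption, and back-translates them with the standard genetic code into IUPAC-degenerate DNA. The actual primers are those described in the Examples; the table and function names below are hypothetical and cover only the residues in these two motifs.

```python
from itertools import product
from typing import List

# Number of bases represented by each IUPAC symbol used below.
IUPAC_SIZE = {"A": 1, "C": 1, "G": 1, "T": 1, "Y": 2, "N": 4}

# Degenerate codons (standard genetic code) for the residues in the motifs.
# Serine has two codon families (TCN and AGY), so it yields two variants.
DEGENERATE_CODONS = {
    "T": ["ACN"],
    "G": ["GGN"],
    "W": ["TGG"],
    "H": ["CAY"],
    "C": ["TGY"],
    "D": ["GAY"],
    "S": ["TCN", "AGY"],
}


def back_translate(peptide: str) -> List[str]:
    """All degenerate coding sequences for a peptide motif (sense strand)."""
    options = [DEGENERATE_CODONS[aa] for aa in peptide.upper()]
    return ["".join(codons) for codons in product(*options)]


def degeneracy(primer: str) -> int:
    """Number of distinct oligonucleotides in the degenerate primer pool."""
    count = 1
    for base in primer:
        count *= IUPAC_SIZE[base]
    return count


for motif in ("TTGWHCGT", "GDSGG"):
    for primer in back_translate(motif):
        print(motif, primer, degeneracy(primer))
```

A reverse primer would use the reverse complement of the sense-strand sequence, and in practice the pool size is often reduced by trimming the final wobble base or by following the codon usage of the source organism.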

In view of the above, it will be appreciated that the polynucleotide sequences 
provided herein and based on the polynucleotide sequences provided in SEQ ID NOS:1-5 
are useful for obtaining identical or homologous fragments of polynucleotides from other 
species, and particularly from bacteria that encode enzymes having the serine protease 
activity expressed by protease 69B4. 

IV. Expression and Recovery of Serine Proteases of the Present Invention 

Any suitable means for expression and recovery of the serine proteases of the 
present invention finds use herein. Indeed, those of skill in the art know many methods 
suitable for cloning a Cellulomonas-derived polypeptide having proteolytic activity, as well as 
an additional enzyme (e.g., a second peptide having proteolytic activity, such as a protease, 
cellulase, mannanase, or amylase, etc.). Numerous methods are also known in the art for 
introducing at least one (e.g., multiple) copies of the polynucleotide(s) encoding the 
enzyme(s) of the present invention in conjunction with any additional sequences desired, 
into the genes or genome of host cells. 

In general, standard procedures for cloning of genes and introducing exogenous 
protease encoding regions (including multiple copies of the exogenous encoding regions) 
into said genes find use in obtaining a Cellulomonas 69B4 protease derivative or homologue 
thereof. Indeed, the present Specification, including the Examples, provides such teaching. 
However, additional methods known in the art are also suitable (See e.g., Sambrook et al., 
supra [1989]; Ausubel et al., supra [1995]; Harwood and Cutting (eds.), Molecular 
Biological Methods for Bacillus, John Wiley and Sons, [1990]; and WO 96/34946). 

In some preferred embodiments, the polynucleotide sequences of the present 
invention are expressed by operatively linking them to an expression control sequence in an 
appropriate expression vector and employing that expression vector to transform an 
appropriate host according to techniques well established in the art. In some embodiments, 
the polypeptides produced on expression of the DNA sequences of this invention are 
isolated from the fermentation of cell cultures and purified in a variety of ways according to 
well established techniques in the art. Those of skill in the art are capable of selecting the 
most appropriate isolation and purification techniques. 

More particularly, the present invention provides constructs, vectors comprising 
polynucleotides described herein, host cells transformed with such vectors, proteases 
expressed by such host cells, expression methods and systems for the production of serine 
protease enzymes derived from microorganisms, in particular, members of the 
Micrococcineae, including but not limited to Cellulomonas species. In some embodiments, 
the polynucleotide(s) encoding serine protease(s) are used to produce recombinant host 
cells suitable for the expression of the serine protease(s). In some preferred embodiments, 
the expression hosts are capable of producing the protease(s) in commercially viable 
quantities. 



IV. Recombinant Vectors 

As indicated above, in some embodiments, the present invention provides vectors 
comprising the aforementioned polynucleotides. In some embodiments, the vectors (i.e., 
constructs) of the invention encoding the protease are of genomic origin (e.g., prepared 
through use of a genomic library and screening for DNA sequences coding for all or part of 
the protease by hybridization using synthetic oligonucleotide probes in accordance with 
standard techniques). In some preferred embodiments, the DNA sequence encoding the 
protease is obtained by isolating chromosomal DNA from the Cellulomonas strain 69B4 and 
amplifying the sequence by PCR methodology (See, the Examples). 

In alternative embodiments, the nucleic acid construct of the invention encoding the 
protease is prepared synthetically by established standard methods (See e.g., Beaucage 
and Caruthers, Tetra. Lett. 22:1859-1869 [1981]; and Matthes et al., EMBO J., 3:801-805 
[1984]). According to the phosphoramidite method, oligonucleotides are synthesized (e.g., 
in an automatic DNA synthesizer), purified, annealed, ligated and cloned in suitable vectors. 

In additional embodiments, the nucleic acid construct is of mixed synthetic and 
genomic origin. In some embodiments, the construct is prepared by ligating fragments of 
synthetic or genomic DNA (as appropriate), wherein the fragments correspond to various 
parts of the entire nucleic acid construct, in accordance with standard techniques. 

In further embodiments, the present invention provides vectors comprising at least 
one DNA construct of the present invention. In some embodiments, the present invention 
encompasses recombinant vectors. It is contemplated that any suitable vector will find use 
in the present invention, including autonomously replicating vectors as well as vectors that 
integrate (either transiently or stably) within the host cell genome. Indeed, a wide variety of 
vectors and expression cassettes suitable for the cloning, transformation and expression in 
fungal (mold and yeast), bacterial, insect and plant cells are known to those of skill in the 
art. Typically, the vector or cassette contains sequences directing transcription and 
translation of the nucleic acid, a selectable marker, and sequences allowing autonomous 
replication or chromosomal integration. In some embodiments, suitable vectors comprise a 
region 5' of the gene which harbors transcriptional initiation controls and a region 3' of the 
DNA fragment which controls transcriptional termination. These control regions may be 
derived from genes homologous or heterologous to the host as long as the control region 
selected is able to function in the host cell. 




The vector is preferably an expression vector in which the DNA sequence encoding 
the protease of the invention is operably linked to additional segments required for 
transcription of the DNA. In some preferred embodiments, the expression vector is derived 
from plasmid or viral DNA, or in alternative embodiments, contains elements of both. 
Exemplary vectors include, but are not limited to pSEGCT, pSEACT, and/or pSEA4CT, as 
well as all of the vectors described in the Examples herein. Construction of such vectors is 
described herein, and methods are well known in the art (See e.g., U.S. Pat. No. 6,287,839; 
and WO 02/50245). In some preferred embodiments, the vector pSEGCT (about 8302 bp; 
See, Figure 5) finds use in the construction of a vector comprising the polynucleotides 
described herein (e.g., pSEG69B4T; See, Figure 6). In alternative preferred embodiments, 
the vector pSEA469B4CT (See, Figure 7) finds use in the construction of a vector 
comprising the polynucleotides described herein. Indeed, it is intended that all of the 
vectors described herein will find use in the present invention. 

In some embodiments, the additional segments required for transcription include 
regulatory segments (e.g., promoters, secretory segments, inhibitors, global regulators, 
etc.), as known in the art. One example includes any DNA sequence that shows 
transcriptional activity in the host cell of choice and is derived from genes encoding proteins 
either homologous or heterologous to the host cell. Specifically, examples of suitable 
promoters for use in bacterial host cells include but are not limited to the promoter of the 
Bacillus stearothermophilus maltogenic amylase gene, the Bacillus amyloliquefaciens (BAN) 
amylase gene, the Bacillus subtilis alkaline protease gene, the Bacillus clausii alkaline 
protease gene, the Bacillus pumilus xylosidase gene, the Bacillus thuringiensis cryIIIA, and 
the Bacillus licheniformis alpha-amylase gene. Additional promoters include the A4 
promoter, as described herein. Other promoters that find use in the present invention 
include, but are not limited to phage Lambda PR or PL promoters, as well as the E. coli lac, 
trp or tac promoters. 

In some embodiments, the promoter is derived from a gene encoding said protease 
or a fragment thereof having substantially the same promoter activity as said sequence. 
The invention further encompasses nucleic acid sequences which hybridize to the promoter 
sequences under intermediate, high, and/or maximum stringency conditions, or which have 
at least about 90% homology and preferably about 95% homology to such promoter, but 
which have substantially the same promoter activity. In some embodiments, this promoter is 
used to promote the expression of either the protease and/or a heterologous DNA sequence 
(e.g., another enzyme in addition to the protease of the present invention). In additional 
embodiments, the vector also comprises at least one selectable marker. 

In some embodiments, the recombinant vectors of the invention further comprise a 
DNA sequence enabling the vector to replicate in the host cell. In some preferred 
embodiments involving bacterial host cells, these sequences comprise all the sequences 
needed to allow plasmid replication (e.g., ori and/or rep sequences). 

In some particularly preferred embodiments, signal sequences (e.g., leader 
sequence or pre sequence) are also included in the vector, in order to direct a polypeptide of 
the present invention into the secretory pathway of the host cells. In some more preferred 
embodiments, a secretory signal sequence is joined to the DNA sequence encoding the 
precursor protease in the correct reading frame (See e.g., SEQ ID NOS:1 and 2). 
Depending on whether the protease is to be expressed intracellularly or is secreted, a 
polynucleotide sequence or expression vector of the invention is engineered with or without 
a natural polypeptide signal sequence or a signal sequence which functions in bacteria (e.g., 
Bacillus sp.), fungi (e.g., Trichoderma), other prokaryotes or eukaryotes. In some 
embodiments, expression is achieved by either removing or partially removing the signal 
sequence. 

In some embodiments involving secretion from bacterial cells, the signal peptide is a 
naturally occurring signal peptide, or a functional part thereof, while in other embodiments, it 
is a synthetic peptide. Suitable signal peptides include but are not limited to sequences 
derived from Bacillus licheniformis alpha-amylase, Bacillus clausii alkaline protease, and 
Bacillus amyloliquefaciens amylase. One preferred signal sequence is the signal peptide 
derived from Cellulomonas strain 69B4, as described herein. Thus, in some particularly 
preferred embodiments, the signal peptide comprises the signal peptide from the protease 
described herein. This signal finds use in facilitating the secretion of the 69B4 protease 
and/or a heterologous DNA sequence (e.g. a second protease, such as another wild-type 
protease, a BPN' variant protease, a GG36 variant protease, a lipase, a cellulase, a 
mannanase, etc.). In some embodiments, these second enzymes are encoded by the DNA 
sequence and/or the amino acid sequences known in the art (See e.g., U.S. Pat. Nos. 
6,465,235, 6,287,839, 5,965,384, and 5,795,764; as well as WO 98/22500, WO 92/05249, 
EP 0305216 B1, and WO 94/25576). Furthermore, it is contemplated that in some 
embodiments, the signal sequence peptide is also operatively linked to an endogenous 
sequence to activate and secrete such endogenous encoded protease. 

The procedures used to ligate the DNA sequences coding for the present protease, 
the promoter and/or secretory signal sequence, respectively, and to insert them into suitable 
vectors containing the information necessary for replication, are well known to those skilled 
in the art. As indicated above, in some embodiments, the nucleic acid construct is prepared 
using PCR with specific primers. 

V. Host Cells 

As indicated above, in some embodiments, the present invention also provides host 
cells transformed with the vectors described above. In some embodiments, the 
polynucleotide encoding the protease(s) of the present invention that is introduced into the 
host cell is homologous, while in other embodiments, the polynucleotide is heterologous to 
the host. In some embodiments in which the polynucleotide is homologous to the host cell 
(e.g., additional copies of the native protease produced by the host cell are introduced), it is 
operably connected to another homologous or heterologous promoter sequence. In 
alternative embodiments, another secretory signal sequence and/or terminator sequence 
finds use in the present invention. Thus, in some embodiments, the polypeptide DNA 
sequence comprises multiple copies of a homologous polypeptide sequence, a 
heterologous polypeptide sequence from another organism, or synthetic polypeptide 
sequence(s). Indeed, it is not intended that the present invention be limited to any particular 
host cells and/or vectors. 

Indeed, the host cell into which the DNA construct of the present invention is 
introduced may be any cell which is capable of producing the present alkaline protease, 
including, but not limited to bacteria, fungi, and higher eukaryotic cells. 

Examples of bacterial host cells which find use in the present invention include, but 
are not limited to Gram-positive bacteria such as Bacillus, Streptomyces, and Thermobifida, 
for example strains of B. subtilis, B. licheniformis, B. lentus, B. brevis, B. 
stearothermophilus, B. clausii, B. amyloliquefaciens, B. coagulans, B. circulans, B. lautus, B. 
megaterium, B. thuringiensis, S. griseus, S. lividans, S. coelicolor, S. avermitilis and T. 
fusca; as well as Gram-negative bacteria such as members of the Enterobacteriaceae (e.g., 
Escherichia coli). In some particularly preferred embodiments, the host cells are B. subtilis, 
B. clausii, and/or B. licheniformis. In additional preferred embodiments, the host cells are 
strains of S. lividans (e.g., TK23 and/or TK21). Any suitable method for transformation of 
the bacteria finds use in the present invention, including but not limited to protoplast 
transformation, use of competent cells, etc., as known in the art. In some preferred 
embodiments, the method provided in U.S. Pat. No. 5,264,366 (incorporated by reference 
herein) finds use in the present invention. For S. lividans, one preferred means for 
transformation and protein expression is that described by Fernandez-Abalos et al. (See, 
Fernandez-Abalos et al., Microbiol., 149:1623-1632 [2003]; See also, Hopwood, et al., 
Genetic Manipulation of Streptomyces: A Laboratory Manual, Innis [1985], both of which are 
incorporated by reference herein). Of course, the methods described in the Example herein 
find use in the present invention. 

Examples of fungal host cells which find use in the present invention include, but are 
not limited to Trichoderma spp. and Aspergillus spp. In some particularly preferred 
embodiments, the host cells are Trichoderma reesei and/or Aspergillus niger. In some 
embodiments, transformation and expression in Aspergillus is performed as described in 
U.S. Pat. No. 5,364,770, herein incorporated by reference. Of course, the methods described in 
the Example herein find use in the present invention. 

In some embodiments, particular promoter and signal sequences are needed to 
provide effective transformation and expression of the protease(s) of the present invention. 
Thus, in some preferred embodiments involving the use of Bacillus host cells, the aprE 
promoter is used in combination with known Bacillus-derived signal and other regulatory 
sequences. In some preferred embodiments involving expression in Aspergillus, the glaA 
promoter is used. In some embodiments involving Streptomyces host cells, the glucose 
isomerase (GI) promoter of Actinoplanes missouriensis is used, while in other embodiments, 
the A4 promoter is used. 

In some embodiments involving expression in bacteria such as E. coli, the protease 
is retained in the cytoplasm, typically as insoluble granules (i.e., inclusion bodies). 
However, in other embodiments, the protease is directed to the periplasmic space by a 
bacterial secretion sequence. In the former case, the cells are lysed, and the granules are 
recovered and denatured, after which the protease is refolded by diluting the denaturing 
agent. In the latter case, the protease is recovered from the periplasmic space by disrupting 
the cells (e.g., by sonication or osmotic shock) to release the contents of the periplasmic 
space, and recovering the protease. 

In preferred embodiments, the transformed host cells of the present invention are 
cultured in a suitable nutrient medium under conditions permitting the expression of the 
present protease, after which the resulting protease is recovered from the culture. The 
medium used to culture the cells comprises any conventional medium suitable for growing 
the host cells, such as minimal or complex media containing appropriate supplements. 
Suitable media are available from commercial suppliers or may be prepared according to 
published recipes (e.g., in catalogues of the American Type Culture Collection). In some 
embodiments, the protease produced by the cells is recovered from the culture medium by 
conventional procedures, including, but not limited to separating the host cells from the 
medium by centrifugation or filtration, precipitating the proteinaceous components of the 
supernatant or filtrate by means of a salt (e.g., ammonium sulfate), and chromatographic 
purification (e.g., ion exchange, gel filtration, affinity, etc.). Thus, any method suitable for 
recovering the protease(s) of the present invention will find use. Indeed, it is not intended 
that the present invention be limited to any particular purification method. 



VI. Applications for Serine Protease Enzymes 

As described in greater detail herein, the proteases of the present invention have 
important characteristics that make them very suitable for certain applications. For example, 
the proteases of the present invention have enhanced thermal stability, enhanced oxidative 
stability, and enhanced chelator stability, as compared to some currently used proteases. 
Thus, these proteases find use in cleaning compositions. Indeed, under certain 
wash conditions, the present proteases exhibit comparable or enhanced wash performance 
as compared with currently used subtilisin proteases. Thus, it is contemplated that the 
cleaning and/or enzyme compositions of the present invention will be provided in a variety of 
cleaning compositions. In some embodiments, the proteases of the present invention are 
utilized in the same manner as subtilisin proteases (i.e., proteases currently in use). Thus, 
the present proteases find use in various cleaning compositions, as well as animal feed 
applications, leather processing (e.g., bating), protein hydrolysis, and in textile uses. The 
identified proteases also find use in personal care applications. 

Thus, the proteases of the present invention find use in a number of industrial 
applications, in particular within the cleaning, disinfecting, animal feed, and textile/leather 
industries. In some embodiments, the protease(s) of the present invention are combined 
with detergents, builders, bleaching agents and other conventional ingredients to produce a 
variety of novel cleaning compositions useful in the laundry and other cleaning arts such as, 
for example, laundry detergents (both powdered and liquid), laundry pre-soaks, all fabric 
bleaches, automatic dishwashing detergents (both liquid and powdered), household 
cleaners, particularly bar and liquid soap applications, and drain openers. In addition, the 
proteases find use in the cleaning of contact lenses, as well as other items, by contacting 
such materials with an aqueous solution of the cleaning composition. In addition, these 
naturally occurring proteases can be used, for example, in peptide hydrolysis, waste 
treatment, textile applications, medical device cleaning, biofilm removal and as 
fusion-cleavage enzymes in protein production, etc. The composition of these products is not 
critical to the present invention, as long as the protease(s) maintain their function in the 
setting used. In some embodiments, the compositions are readily prepared by combining a 
cleaning effective amount of the protease or an enzyme composition comprising the 
protease enzyme preparation with the conventional components of such compositions in 
their art recognized amounts. 

A. Cleaning Compositions 

The cleaning composition of the present invention may be advantageously employed, 
for example, in laundry applications, hard surface cleaning, automatic dishwashing 
applications, as well as cosmetic applications such as dentures, teeth, hair and skin. 
However, due to the unique advantages of increased effectiveness in lower temperature 
solutions and the superior color-safety profile, the enzymes of the present invention are 
ideally suited for laundry applications such as the bleaching of fabrics. Furthermore, the 
enzymes of the present invention may be employed in both granular and liquid 
compositions. 

The enzymes of the present invention may also be employed in a cleaning additive 
product. A cleaning additive product including the enzymes of the present invention is 
ideally suited for inclusion in a wash process when additional bleaching effectiveness is 
desired. Such instances may include, but are not limited to low temperature solution 
cleaning applications. The additive product may be, in its simplest form, one or more 
proteases, including ASP. Such additive may be packaged in dosage form for addition to a 
cleaning process where a source of peroxygen is employed and increased bleaching 
effectiveness is desired. Such single dosage form may comprise a pill, tablet, gelcap or 
other single dosage unit such as pre-measured powders or liquids. A filler or carrier 
material may be included to increase the volume of such composition. Suitable filler or 
carrier materials include, but are not limited to, various salts of sulfate, carbonate and 
silicate as well as talc, clay and the like. Filler or carrier materials for liquid compositions 
may be water or low molecular weight primary and secondary alcohols including polyols and 
diols. Examples of such alcohols include, but are not limited to, methanol, ethanol, propanol 
and isopropanol. The compositions may contain from about 5% to about 90% of such 
materials. Acidic fillers can be used to reduce pH. Alternatively, the cleaning additive may 
include an activated peroxygen source as defined below or the adjunct ingredients as fully 
defined below. 

The present cleaning compositions and cleaning additives require an effective 
amount of the ASP enzyme and/or variants provided herein. The required level of enzyme 
may be achieved by the addition of one or more species of the enzymes of the present 
invention. Typically the present cleaning compositions will comprise at least 0.0001 weight 
percent, from about 0.0001 to about 1, from about 0.001 to about 0.5, or even from about 
0.01 to about 0.1 weight percent of at least one of the enzymes of the present invention. 

The cleaning compositions herein will typically be formulated such that, during use in 
aqueous cleaning operations, the wash water will have a pH of from about 5.0 to about 11.5 
or even from about 7.5 to about 10.5. Liquid product formulations are typically formulated to 
have a neat pH from about 3.0 to about 9.0 or even from about 3 to about 5. Granular 
laundry products are typically formulated to have a pH from about 9 to about 11. 
Techniques for controlling pH at recommended usage levels include the use of buffers, 
alkalis, acids, etc., and are well known to those skilled in the art. 

Suitable low pH cleaning compositions typically have a neat pH of from about 3 to 
about 5, and are typically free of surfactants that hydrolyze in such a pH environment. Such 
surfactants include sodium alkyl sulfate surfactants that comprise at least one ethylene 
oxide moiety or even from about 1 to 16 moles of ethylene oxide. Such cleaning 
compositions typically comprise a sufficient amount of a pH modifier, such as sodium 
hydroxide, monoethanolamine or hydrochloric acid, to provide such cleaning composition 
with a neat pH of from about 3 to about 5. Such compositions typically comprise at least one 
acid stable enzyme. Said compositions may be liquids or solids. The pH of such liquid 
compositions is measured as a neat pH. The pH of such solid compositions is measured as 
a 10% solids solution of said composition wherein the solvent is distilled water. In these 
embodiments, all pH measurements are taken at 20°C. 

When the serine protease(s) is/are employed in a granular composition or liquid, it 
may be desirable for the enzyme to be in the form of an encapsulated particle to protect 
such enzyme from other components of the granular composition during storage. In 
addition, encapsulation is also a means of controlling the availability of the enzyme during 
the cleaning process and may enhance performance of the enzymes provided herein. In 
this regard, the serine proteases of the present invention may be encapsulated with any 
encapsulating material known in the art. 

The encapsulating material typically encapsulates at least part of the catalyst for the 
enzymes of the present invention. Typically, the encapsulating material is water-soluble 
and/or water-dispersible. The encapsulating material may have a glass transition 
temperature (Tg) of 0°C or higher. Glass transition temperature is described in more detail 
in WO 97/11151, especially from page 6, line 25 to page 7, line 2. 

The encapsulating material may be selected from the group consisting of 
carbohydrates, natural or synthetic gums, chitin and chitosan, cellulose and cellulose 
derivatives, silicates, phosphates, borates, polyvinyl alcohol, polyethylene glycol, paraffin 
waxes and combinations thereof. When the encapsulating material is a carbohydrate, it 
is typically selected from the group consisting of monosaccharides, oligosaccharides, 
polysaccharides, and combinations thereof. Typically, the encapsulating material is a 
starch. Suitable starches are described in EP 0 922 499; US 4,977,252; US 5,354,559 and 
US 5,935,826. 

The encapsulating material may be a microsphere made from plastic such as 
thermoplastics, acrylonitrile, methacrylonitrile, polyacrylonitrile, polymethacrylonitrile and 
mixtures thereof; commercially available microspheres that can be used are those supplied 
by Expancel of Stockviksverken, Sweden under the trademark Expancel®, and those 
supplied by PQ Corp. of Valley Forge, Pennsylvania U.S.A. under the tradename PM 6545, 
PM 6550, PM 7220, PM 7228, Extendospheres®, Luxsil®, Q-cel® and Sphericel®. 

As described herein, the proteases of the present invention find particular use in the 
cleaning industry, including, but not limited to laundry and dish detergents. These 
applications place enzymes under various environmental stresses. The proteases of the 
present invention provide advantages over many currently used enzymes, due to their 
stability under various conditions. 

Indeed, there are a variety of wash conditions including varying detergent 
formulations, wash water volumes, wash water temperatures, and lengths of wash time, to 
which proteases involved in washing are exposed. In addition, detergent formulations used 
in different geographical areas have different concentrations of their relevant components 
present in the wash water. For example, a European detergent typically has about 
4500-5000 ppm of detergent components in the wash water, while a Japanese detergent typically 
has approximately 667 ppm of detergent components in the wash water. In North America, 
particularly the United States, detergents typically have about 975 ppm of detergent 
components present in the wash water. 

A low detergent concentration system includes detergents where less than about 800 
ppm of detergent components are present in the wash water. Japanese detergents are 
typically considered low detergent concentration systems as they have approximately 667 
ppm of detergent components present in the wash water. 

A medium detergent concentration system includes detergents where between about 800 
ppm and about 2000 ppm of detergent components are present in the wash water. North 
American detergents are generally considered to be medium detergent concentration 
systems as they have approximately 975 ppm of detergent components present in the wash 
water. Brazil typically has approximately 1500 ppm of detergent components present in the 
wash water. 

A high detergent concentration system includes detergents where greater than about 
2000 ppm of detergent components are present in the wash water. European detergents 
are generally considered to be high detergent concentration systems as they have 
approximately 4500-5000 ppm of detergent components in the wash water. 

Latin American detergents are generally high suds phosphate builder detergents and 
the range of detergents used in Latin America can fall in both the medium and high 
detergent concentrations as they range from 1500 ppm to 6000 ppm of detergent 
components in the wash water. As mentioned above, Brazil typically has approximately 1500 
ppm of detergent components present in the wash water. However, other high suds 
phosphate builder detergent geographies, not limited to other Latin American countries, may 
have high detergent concentration systems up to about 6000 ppm of detergent components 
present in the wash water. 

In light of the foregoing, it is evident that concentrations of detergent compositions in 
typical wash solutions throughout the world vary from less than about 800 ppm of 
detergent composition ("low detergent concentration geographies"), for example about 667 
ppm in Japan, to between about 800 ppm and about 2000 ppm ("medium detergent 
concentration geographies"), for example about 975 ppm in the U.S. and about 1500 ppm in 
Brazil, to greater than about 2000 ppm ("high detergent concentration geographies"), for 
example about 4500 ppm to about 5000 ppm in Europe and about 6000 ppm in high suds 
phosphate builder geographies. 
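
By way of illustration only, the three concentration categories summarized above can be written as a simple classification. The following Python sketch is not part of the methods described herein; the function name `classify_detergent_concentration` is hypothetical, and treating exactly 800 ppm and exactly 2000 ppm as "medium" is an assumption, since the ranges above are stated only as "about."

```python
def classify_detergent_concentration(ppm: float) -> str:
    """Classify a wash-water detergent concentration (in ppm).

    Thresholds follow the text: below about 800 ppm is "low",
    about 800 to about 2000 ppm is "medium", above about 2000 ppm is "high".
    Handling of the exact boundary values is an assumption.
    """
    if ppm < 0:
        raise ValueError("concentration must be non-negative")
    if ppm < 800:
        return "low"
    if ppm <= 2000:
        return "medium"
    return "high"


# Typical values quoted in the text (Europe is given as 4500-5000 ppm;
# the lower end is used here).
TYPICAL_PPM = {
    "Japan": 667,
    "United States": 975,
    "Brazil": 1500,
    "Europe": 4500,
    "high suds phosphate builder geographies": 6000,
}

if __name__ == "__main__":
    for region, ppm in TYPICAL_PPM.items():
        print(f"{region}: {ppm} ppm -> {classify_detergent_concentration(ppm)}")
```

Under these assumptions, Japan falls in the low category, the United States and Brazil in the medium category, and Europe and the high suds phosphate builder geographies in the high category, in agreement with the text.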

The concentrations of the typical wash solutions are determined empirically. For 
example, in the U.S., a typical washing machine holds a volume of about 64.4 L of wash 
solution. Accordingly, in order to obtain a concentration of about 975 ppm of detergent 
within the wash solution, about 62.79 g of detergent composition must be added to the 64.4 
L of wash solution. This amount is the typical amount measured into the wash water by the 
consumer using the measuring cup provided with the detergent. 
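
The arithmetic behind the 62.79 g figure can be checked with a short calculation. The following Python sketch is illustrative only; the function name `detergent_dose_grams` is hypothetical, and it assumes that 1 ppm corresponds to 1 mg of detergent per liter of wash solution (i.e., a dilute aqueous solution with a density of about 1 kg/L).

```python
def detergent_dose_grams(target_ppm: float, wash_volume_liters: float) -> float:
    """Grams of detergent needed to reach target_ppm in the given wash volume.

    Assumes 1 ppm = 1 mg/L, which holds approximately for dilute
    aqueous solutions.
    """
    milligrams = target_ppm * wash_volume_liters
    return milligrams / 1000.0


if __name__ == "__main__":
    # U.S. example from the text: about 975 ppm in about 64.4 L.
    dose = detergent_dose_grams(975, 64.4)
    print(f"{dose:.2f} g")
```

For the U.S. example, 975 mg/L multiplied by 64.4 L gives 62,790 mg, or about 62.79 g, matching the amount stated above.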

As a further example, different geographies use different wash temperatures. The 
temperature of the wash water in Japan is typically less than that used in Europe. For 
example, the temperature of the wash water in North America and Japan can be between 
10 and 30°C (e.g., about 20°C), whereas the temperature of wash water in Europe is 
typically between 30 and 60°C (e.g., about 40°C). 

As a further example, different geographies typically have different water hardness. 
Water hardness is usually described in terms of the grains per gallon mixed Ca²⁺/Mg²⁺. 
Hardness is a measure of the amount of calcium (Ca²⁺) and magnesium (Mg²⁺) in the water. 
Most water in the United States is hard, but the degree of hardness varies. Moderately hard 
(60-120 ppm) to hard (121-181 ppm) water has 60 to 181 parts per million (parts per million 
converted to grains per U.S. gallon is the ppm value divided by 17.1 equals grains per gallon) of 
hardness minerals. 



Water              Grains per gallon      Parts per million 
Soft               less than 1.0          less than 17 
Slightly hard      1.0 to 3.5             17 to 60 
Moderately hard    3.5 to 7.0             60 to 120 
Hard               7.0 to 10.5            120 to 180 
Very hard          greater than 10.5      greater than 180 
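
For illustration only, the conversion factor and the hardness categories tabulated above can be combined in a short routine. The following Python sketch is not part of the described invention; the function names are hypothetical, and because adjacent categories in the table share their end points, assigning each shared boundary value to the harder category (except 10.5 grains per gallon, which is kept in "hard") is an assumption.

```python
PPM_PER_GRAIN_PER_GALLON = 17.1  # conversion factor stated in the text


def ppm_to_grains_per_gallon(ppm: float) -> float:
    """Convert hardness in parts per million to grains per U.S. gallon."""
    return ppm / PPM_PER_GRAIN_PER_GALLON


def classify_hardness(grains_per_gallon: float) -> str:
    """Map hardness in grains per gallon to the categories in the table.

    Boundary handling is an assumption; the table lists overlapping
    end points (e.g., 3.5 appears in two rows).
    """
    if grains_per_gallon < 1.0:
        return "soft"
    if grains_per_gallon < 3.5:
        return "slightly hard"
    if grains_per_gallon < 7.0:
        return "moderately hard"
    if grains_per_gallon <= 10.5:
        return "hard"
    return "very hard"


if __name__ == "__main__":
    for ppm in (10, 50, 100, 150, 250):
        gpg = ppm_to_grains_per_gallon(ppm)
        print(f"{ppm} ppm = {gpg:.1f} grains per gallon ({classify_hardness(gpg)})")
```

The same routine can be applied directly to values already expressed in grains per gallon.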



European water hardness is typically greater than 10.5 (for example 10.5-20.0) 
grains per gallon mixed Ca²⁺/Mg²⁺ (e.g., about 15 grains per gallon mixed Ca²⁺/Mg²⁺). 
North American water hardness is typically greater than Japanese water hardness, but less 
than European water hardness. For example, North American water hardness can be 
between 3 to 10 grains, 3-8 grains or about 6 grains. Japanese water hardness is typically 
lower than North American water hardness, usually less than 4, for example 3 grains per 
gallon mixed Ca²⁺/Mg²⁺. 

Accordingly, in some embodiments, the present invention provides proteases that 
show surprising wash performance in at least one set of wash conditions (e.g., water 
temperature, water hardness, and/or detergent concentration). In some embodiments, the 
proteases of the present invention are comparable in wash performance to subtilisin 
proteases. In some embodiments, the proteases of the present invention exhibit enhanced 
wash performance as compared to subtilisin proteases. Thus, in some preferred 
embodiments of the present invention, the proteases provided herein exhibit enhanced 
oxidative stability, enhanced thermal stability, and/or enhanced chelator stability. 

In some preferred embodiments, the present invention provides the ASP protease, 
as well as homologues and variants of the protease. These proteases find use in any 
applications in which it is desired to clean protein-based stains from textiles or fabrics. 

In some embodiments, the cleaning compositions of the present invention are 
formulated as hand and machine laundry detergent compositions including laundry additive 
compositions, and compositions suitable for use in the pretreatment of stained fabrics, 
rinse-added fabric softener compositions, and compositions for use in general household hard 
surface cleaning operations, as well as dishwashing operations. Those in the art are 
familiar with different formulations which can be used as cleaning compositions. In 
preferred embodiments, the proteases of the present invention exhibit comparable or 
enhanced performance in detergent compositions (i.e., as compared to other proteases). In 
some embodiments, cleaning performance is evaluated by comparing the proteases of the 
present invention with subtilisin proteases in various cleaning assays that utilize 
enzyme-sensitive stains such as egg, grass, blood, milk, etc., in standard methods. Indeed, those in 
the art are familiar with the spectrophotometric and other analytical methodologies used to 
assess detergent performance under standard wash cycle conditions. 

Assays that find use in the present invention include, but are not limited to those 
described in WO 99/34011, and U.S. Pat. No. 6,605,458 (See e.g., Example 3). In U.S. 
Pat. No. 6,605,458, at Example 3, a detergent dose of 3.0 g/l at pH 10.5, wash time 15 
minutes, at 15°C, water hardness of 6°dH, 10 nM enzyme concentration in 150 ml glass 
beakers with stirring rod, 5 textile pieces (phi 2.5 cm) in 50 ml, EMPA 117 test material from 
Center for Test Materials Holland are used. The measurement of reflectance "R" on the test 
material was done at 460 nm using a Macbeth ColorEye 7000 photometer. Additional 
methods are provided in the Examples herein. Thus, these methods also find use in the 
present invention. 

The addition of proteases of the invention to conventional cleaning compositions 
does not create any special use limitation. In other words, any temperature and pH suitable 
for the detergent is also suitable for the present compositions, as long as the pH is within 
the range set forth herein, and the temperature is below the described protease's denaturing 
temperature. In addition, proteases of the present invention find use in cleaning 
compositions that do not include detergents, again either alone or in combination with 
builders and stabilizers. 

When used in cleaning compositions or detergents, oxidative stability is a further 
consideration. Thus, in some applications, the stability is enhanced, diminished, or 
comparable to subtilisin proteases as desired for various uses. In some preferred 
embodiments, enhanced oxidative stability is desired. Some of the proteases of the 
present invention find particular use in such applications. 

When used in cleaning compositions or detergents, thermal stability is a further 
consideration. Thus, in some applications, the stability is enhanced, diminished, or 
comparable to subtilisin proteases as desired for various uses. In some preferred 
embodiments, enhanced thermostability is desired. Some of the proteases of the present 
invention find particular use in such applications. 

When used in cleaning compositions or detergents, chelator stability is a further 
consideration. Thus, in some applications, the stability is enhanced, diminished, or 
derived from Bacillus (e.g., subtilisin, lentus, amyloliquefaciens, subtilisin Carlsberg, 
subtilisin 309, subtilisin 147 and subtilisin 168). Additional examples include those mutant 
proteases described in U.S. Pat. Nos. RE 34,606, 5,955,340, 5,700,676, 6,312,936, and 
6,482,628, all of which are incorporated herein by reference. Additional protease examples 
include, but are not limited to trypsin (e.g., of porcine or bovine origin), and the Fusarium 
protease described in WO 89/06270. Preferred commercially available protease enzymes 
include those sold under the trade names MAXATASE®, MAXACAL™, MAXAPEM™, 
OPTICLEAN®, OPTIMASE®, PROPERASE®, PURAFECT® and PURAFECT® OXP 
(Genencor); those sold under the trade names ALCALASE®, SAVINASE®, PRIMASE®, 
DURAZYM™, RELASE® and ESPERASE® (Novozymes); and those sold under the trade 
name BLAP™ (Henkel Kommanditgesellschaft auf Aktien, Duesseldorf, Germany). Various 
proteases are described in WO 95/23221, WO 92/21760, and U.S. Pat. Nos. 5,801,039, 
5,340,735, 5,500,364, 5,855,625. An additional BPN' variant ("BPN'-var 1" and 
"BPN'-variant 1"; as referred to herein) is described in US RE 34,606. An additional GG36-variant 
("GG36-var.1" and "GG36-variant 1"; as referred to herein) is described in US 5,955,340 
and 5,700,676. A further GG36-variant is described in US Patents 6,312,936 and 
6,482,628. In one aspect of the present invention, the cleaning compositions of the present 
invention comprise additional protease enzymes at a level from 0.00001% to 10% of 
additional protease by weight of the composition and 99.999% to 90.0% of cleaning adjunct 
materials by weight of composition. In other embodiments of the present invention, the 
cleaning compositions of the present invention also comprise proteases at a level of 
0.0001% to 10%, 0.001% to 5%, 0.001% to 2%, or 0.005% to 0.5% 69B4 protease (or its homologues 
or variants) by weight of the composition and the balance of the cleaning composition (e.g., 
99.9999% to 90.0%, 99.999% to 98%, or 99.995% to 99.5% by weight) comprising cleaning 
adjunct materials. 

In addition, any lipase suitable for use in alkaline solutions finds use in the present 
invention. Suitable lipases include, but are not limited to those of bacterial or fungal origin. 
Chemically or genetically modified mutants are encompassed by the present invention. 
Examples of useful lipases include Humicola lanuginosa lipase (See e.g., EP 258 068, and 
EP 305 216), Rhizomucor miehei lipase (See e.g., EP 238 023), Candida lipase, such as C. 
antarctica lipase (e.g., the C. antarctica lipase A or B; See e.g., EP 214 761), a 
Pseudomonas lipase such as P. alcaligenes and P. pseudoalcaligenes lipase (See e.g., EP 
218 272), P. cepacia lipase (See e.g., EP 331 376), P. stutzeri lipase (See e.g., GB 
1,372,034), P. fluorescens lipase, Bacillus lipase (e.g., B. subtilis lipase [Dartois et al., 
Biochem. Biophys. Acta 1131:253-260 [1993]]; B. stearothermophilus lipase [See e.g., JP 
64/744992]; and B. pumilus lipase [See e.g., WO 91/16422]). 

Furthermore, a number of cloned lipases find use in some embodiments of the 
present invention, including but not limited to Penicillium camembertii lipase (See, 
Yamaguchi et al., Gene 103:61-67 [1991]), Geotrichum candidum lipase (See, Schimada et 
al., J. Biochem., 106:383-388 [1989]), and various Rhizopus lipases such as R. delemar 
lipase (See, Hass et al., Gene 109:117-113 [1991]), a R. niveus lipase (Kugimiya et al., 
Biosci. Biotech. Biochem. 56:716-719 [1992]) and R. oryzae lipase. 

Other types of lipolytic enzymes such as cutinases also find use in some 
embodiments of the present invention, including but not limited to the cutinase derived from 
Pseudomonas mendocina (See, WO 88/09367), or cutinase derived from Fusarium solani 
pisi (See, WO 90/09446). 

Additional suitable lipases include commercially available lipases such as M1 
LIPASE™, LUMA FAST™, and LIPOMAX™ (Genencor); LIPOLASE® and LIPOLASE® 
ULTRA (Novozymes); and LIPASE P™ "Amano" (Amano Pharmaceutical Co. Ltd., Japan). 

In some embodiments of the present invention, the cleaning compositions of the 
present invention further comprise lipases at a level from 0.00001% to 10% of additional 
lipase by weight of the composition and the balance of cleaning adjunct materials by weight 
of composition. In other aspects of the present invention, the cleaning compositions of the 
present invention also comprise lipases at a level of 0.0001% to 10%, 0.001% to 5%, 
0.001% to 2%, or 0.005% to 0.5% lipase by weight of the composition. 

Any amylase (alpha and/or beta) suitable for use in alkaline solutions also finds use in 
some embodiments of the present invention. Suitable amylases include, but are not limited 
to those of bacterial or fungal origin. Chemically or genetically modified mutants are 
included in some embodiments. Amylases that find use in the present invention include, 
but are not limited to α-amylases obtained from B. licheniformis (See e.g., GB 1,296,839). 
Commercially available amylases that find use in the present invention include, but are not 
limited to DURAMYL®, TERMAMYL®, FUNGAMYL® and BAN™ (Novozymes) and 
RAPIDASE® and MAXAMYL® P (Genencor International). 

In some embodiments of the present invention, the cleaning compositions of the 
present invention further comprise amylases at a level from 0.00001% to 10% of additional 
amylase by weight of the composition and the balance of cleaning adjunct materials by 
weight of composition. In other aspects of the present invention, the cleaning compositions 
of the present invention also comprise amylases at a level of 0.0001% to 10%, 0.001% to 
5%, 0.001% to 2%, or 0.005% to 0.5% amylase by weight of the composition. 

Any cellulase suitable for use in alkaline solutions finds use in embodiments of the 
present invention. Suitable cellulases include, but are not limited to those of bacterial or 
fungal origin. Chemically or genetically modified mutants are included in some 
embodiments. Suitable cellulases include, but are not limited to Humicola insolens 
cellulases (See e.g., U.S. Pat. No. 4,435,307). Especially suitable cellulases are the 
cellulases having color care benefits (See e.g., EP 0 495 257). 

Commercially available cellulases that find use in the present invention include, but are
not limited to CELLUZYME® (Novozymes), and KAC-500(B)™ (Kao Corporation). In some
embodiments, cellulases are incorporated as portions or fragments of mature wild-type or 
variant cellulases, wherein a portion of the N-terminus is deleted (See e.g., U.S. Pat. No. 
5,874,276). 

In some embodiments, the cleaning compositions of the present invention can
further comprise cellulases at a level from 0.00001% to 10% of additional cellulase by
weight of the composition and the balance of cleaning adjunct materials by weight of
composition. In other aspects of the present invention, the cleaning compositions of the
present invention also comprise cellulases at a level of 0.0001% to 10%, 0.001% to 5%,
0.001% to 2%, 0.005% to 0.5% cellulase by weight of the composition.

Any mannanase suitable for use in detergent compositions and/or alkaline solutions
finds use in the present invention. Suitable mannanases include, but are not limited to those
of bacterial or fungal origin. Chemically or genetically modified mutants are included in some
embodiments. Various mannanases are known which find use in the present invention (See
e.g., U.S. Pat. No. 6,566,114, U.S. Pat. No. 6,602,842, and U.S. Pat. No. 6,440,991, all of
which are incorporated herein by reference).

In some embodiments, the cleaning compositions of the present invention can
further comprise mannanases at a level from 0.00001% to 10% of additional mannanase by
weight of the composition and the balance of cleaning adjunct materials by weight of
composition. In other aspects of the present invention, the cleaning compositions of the
present invention also comprise mannanases at a level of 0.0001% to 10%, 0.001% to 5%,
0.001% to 2%, 0.005% to 0.5% mannanase by weight of the composition.

In some embodiments, peroxidases are used in combination with hydrogen peroxide
or a source thereof (e.g., a percarbonate, perborate or persulfate). In alternative
embodiments, oxidases are used in combination with oxygen. Both types of enzymes are
used for "solution bleaching" (i.e., to prevent transfer of a textile dye from a dyed fabric to
another fabric when the fabrics are washed together in a wash liquor), preferably together
with an enhancing agent (See e.g., WO 94/12621 and WO 95/01426). Suitable
peroxidases/oxidases include, but are not limited to those of plant, bacterial or fungal origin.
Chemically or genetically modified mutants are included in some embodiments.

In some embodiments, the cleaning compositions of the present invention can
further comprise peroxidase and/or oxidase enzymes at a level from 0.00001% to 10% of
additional peroxidase and/or oxidase by weight of the composition and the balance of
cleaning adjunct materials by weight of composition. In other aspects of the present
invention, the cleaning compositions of the present invention also comprise peroxidase
and/or oxidase enzymes at a level of 0.0001% to 10%, 0.001% to 5%, 0.001% to 2%,
0.005% to 0.5% peroxidase and/or oxidase enzymes by weight of the composition.

Mixtures of the above mentioned enzymes are encompassed herein, in particular a
mixture of the 69B4 enzyme, one or more additional proteases, at least one amylase, at
least one lipase, at least one mannanase, and/or at least one cellulase. Indeed, it is
contemplated that various mixtures of these enzymes will find use in the present invention.

It is contemplated that the varying levels of the protease and one or more additional
enzymes may both independently range to 10%, the balance of the cleaning composition
being cleaning adjunct materials. The specific selection of cleaning adjunct materials is
readily made by considering the surface, item, or fabric to be cleaned, and the desired form
of the composition for the cleaning conditions during use (e.g., through the wash detergent
use).
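
As a rough illustration of the weight-percent bookkeeping described above, the following Python sketch computes the balance left for cleaning adjunct materials once the enzyme levels are fixed. The function name, the example levels, and the decision to reject any single enzyme level above 10% are illustrative assumptions and are not taken from the disclosure.

```python
def adjunct_balance(enzyme_levels_pct, max_level_pct=10.0):
    """Return the weight percent left for cleaning adjunct materials.

    enzyme_levels_pct maps an enzyme name to its level in percent by
    weight of the composition (hypothetical example values).
    """
    for name, level in enzyme_levels_pct.items():
        # Each enzyme is checked independently against the upper bound.
        if not 0.0 <= level <= max_level_pct:
            raise ValueError(f"{name}: {level}% is outside 0-{max_level_pct}%")
    total = sum(enzyme_levels_pct.values())
    if total > 100.0:
        raise ValueError("enzyme levels exceed 100% of the composition")
    # Whatever is not enzyme is treated as adjunct material.
    return 100.0 - total


# Hypothetical formulation: three enzymes totalling 1% by weight.
levels = {"protease": 0.5, "amylase": 0.25, "lipase": 0.25}
balance = adjunct_balance(levels)
```

With these example levels the enzymes total 1% by weight, so the remaining 99% would be made up of cleaning adjunct materials.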

Examples of suitable cleaning adjunct materials include, but are not limited to,
surfactants, builders, bleaches, bleach activators, bleach catalysts, other enzymes, enzyme
stabilizing systems, chelants, optical brighteners, soil release polymers, dye transfer agents,
dispersants, suds suppressors, dyes, perfumes, colorants, filler salts, hydrotropes,
photoactivators, fluorescers, fabric conditioners, hydrolyzable surfactants, preservatives,
anti-oxidants, anti-shrinkage agents, anti-wrinkle agents, germicides, fungicides, color
speckles, silvercare, anti-tarnish and/or anti-corrosion agents, alkalinity sources, solubilizing
agents, carriers, processing aids, pigments, and pH control agents (See e.g., U.S. Pat. Nos.
6,610,642, 6,605,458, 5,705,464, 5,710,115, 5,698,504, 5,695,679, 5,686,014 and
5,646,101, all of which are incorporated herein by reference). Embodiments of specific
cleaning composition materials are exemplified in detail below.

If the cleaning adjunct materials are not compatible with the proteases of the present
invention in the cleaning compositions, then suitable methods of keeping the cleaning
adjunct materials and the protease(s) separated (i.e., not in contact with each other) until
combination of the two components is appropriate are used. Such separation methods
include any suitable method known in the art (e.g., gelcaps, encapsulation, tablets, physical
separation, etc.).

Preferably an effective amount of one or more protease(s) provided herein is
included in compositions useful for cleaning a variety of surfaces in need of proteinaceous
stain removal. Such cleaning compositions include cleaning compositions for such
applications as cleaning hard surfaces, fabrics, and dishes. Indeed, in some embodiments,
the present invention provides fabric cleaning compositions, while in other embodiments, the
present invention provides non-fabric cleaning compositions. Notably, the present invention
also provides cleaning compositions suitable for personal care, including oral care (including
dentifrices, toothpastes, mouthwashes, etc., as well as denture cleaning compositions), skin,
and hair cleaning compositions. It is intended that the present invention encompass
detergent compositions in any form (i.e., liquid, granular, bar, semi-solid, gels, emulsions,
tablets, capsules, etc.).

By way of example, several cleaning compositions wherein the protease of the
present invention finds use are described in greater detail below. In embodiments in which
the cleaning compositions of the present invention are formulated as compositions suitable
for use in laundry machine washing method(s), the compositions of the present invention
preferably contain at least one surfactant and at least one builder compound, as well as one
or more cleaning adjunct materials preferably selected from organic polymeric compounds,
bleaching agents, additional enzymes, suds suppressors, dispersants, lime-soap
dispersants, soil suspension and anti-redeposition agents and corrosion inhibitors. In some
embodiments, laundry compositions also contain softening agents (i.e., as additional
cleaning adjunct materials).

The compositions of the present invention also find use as detergent additive products
in solid or liquid form. Such additive products are intended to supplement and/or boost the
performance of conventional detergent compositions and can be added at any stage of the
cleaning process.

In embodiments formulated as compositions for use in manual dishwashing
methods, the compositions of the invention preferably contain at least one surfactant and
preferably at least one additional cleaning adjunct material selected from organic polymeric
compounds, suds enhancing agents, group II metal ions, solvents, hydrotropes and
additional enzymes.

In some embodiments, the density of the laundry detergent compositions herein 
ranges from 400 to 1200 g/liter, while in other embodiments, it ranges from 500 to 950 g/liter 
of composition measured at 20°C. 

In some embodiments, various cleaning compositions such as those provided in U.S.
Pat. No. 6,605,458 find use with the proteases of the present invention. Thus, in some
embodiments, the composition comprising at least one protease of the present invention is
a compact granular fabric cleaning composition, while in other embodiments, the
composition is a granular fabric cleaning composition useful in the laundering of colored
fabrics. In further embodiments, the composition is a granular fabric cleaning composition
which provides softening through the wash capacity, and in additional embodiments, the
composition is a heavy duty liquid fabric cleaning composition.

In some embodiments, the compositions comprising at least one protease of the 
present invention are fabric cleaning compositions such as those described in U.S. Pat. 
Nos. 6,610,642 and 6,376,450. In addition, the proteases of the present invention find use 
in granular laundry detergent compositions of particular utility under European or Japanese 
washing conditions (See e.g., U.S. Pat. No. 6,610,642). 

In alternative embodiments, the present invention provides hard surface cleaning
compositions comprising at least one protease provided herein. Thus, in some
embodiments, the composition comprising at least one protease of the present invention is
a hard surface cleaning composition such as those described in U.S. Pat. Nos. 6,610,642
and 6,376,450.

In yet further embodiments, the present invention provides dishwashing
compositions comprising at least one protease provided herein. Thus, in some
embodiments, the composition comprising at least one protease of the present invention is
a dishwashing composition such as those in U.S. Pat. Nos. 6,610,642 and
6,376,450.

In still further embodiments, the present invention provides oral care
compositions comprising at least one protease provided herein. Thus, in some
embodiments, the compositions comprising at least one protease of the present invention
comprise oral care compositions such as those in U.S. Pat. No. 6,376,450.

The formulations and descriptions of the compounds and cleaning adjunct materials
are contained in the aforementioned U.S. Pat. Nos. 6,376,450, 6,605,458, and
6,610,642, all of which are expressly incorporated by reference herein. Still further
examples are set forth in the Examples below.

I) Processes of Making and Using the Cleaning Composition of the 
Present Invention 

The cleaning compositions of the present invention can be formulated into any 
suitable form and prepared by any process chosen by the formulator, non-limiting examples 
of which are described in U.S. Pat. Nos. 5,879,584, 5,691,297, 5,574,005, 5,569,645, 
5,565,422, 5,516,448, 5,489,392, and 5,486,303, all of which are incorporated herein by
reference. When a low pH cleaning composition is desired, the pH of such composition may
be adjusted via the addition of a material such as monoethanolamine or an acidic material
such as HCl.

II) Adjunct Materials In Addition to the Serine Proteases of the Present 
Invention 

While not essential for the purposes of the present invention, the non-limiting list of
adjuncts illustrated hereinafter are suitable for use in the instant cleaning compositions and
may be desirably incorporated in certain embodiments of the invention, for example to assist
or enhance cleaning performance, for treatment of the substrate to be cleaned, or to modify
the aesthetics of the cleaning composition as is the case with perfumes, colorants, dyes or
the like. It is understood that such adjuncts are in addition to the serine proteases of the
present invention. The precise nature of these additional components, and levels of
incorporation thereof, will depend on the physical form of the composition and the nature of
the cleaning operation for which it is to be used. Suitable adjunct materials include, but are
not limited to, surfactants, builders, chelating agents, dye transfer inhibiting agents,
deposition aids, dispersants, additional enzymes, and enzyme stabilizers, catalytic materials,
bleach activators, bleach boosters, hydrogen peroxide, sources of hydrogen peroxide,
preformed peracids, polymeric dispersing agents, clay soil removal/anti-redeposition agents,
brighteners, suds suppressors, dyes, perfumes, structure elasticizing agents, fabric
softeners, carriers, hydrotropes, processing aids and/or pigments. In addition to the
disclosure below, suitable examples of such other adjuncts and levels of use are found in
U.S. Patent Nos. 5,576,282, 6,306,812, and 6,326,348, that are incorporated by reference.

The aforementioned adjunct ingredients may constitute the balance of the cleaning
compositions of the present invention.

Surfactants - The cleaning compositions according to the present invention may
comprise a surfactant or surfactant system wherein the surfactant can be selected from
nonionic surfactants, anionic surfactants, cationic surfactants, ampholytic surfactants,
zwitterionic surfactants, semi-polar nonionic surfactants and mixtures thereof. When a low
pH cleaning composition, such as a composition having a neat pH of from about 3 to about 5,
is desired, such composition typically does not contain alkyl ethoxylated sulfate as it is
believed that such surfactant may be hydrolyzed by such compositions' acidic contents.
The surfactant is typically present at a level of from about 0.1% to about 60%, from
about 1% to about 50% or even from about 5% to about 40% by weight of the subject
cleaning composition.

Builders - The cleaning compositions of the present invention may comprise one or
more detergent builders or builder systems. When a builder is used, the subject cleaning
composition will typically comprise at least about 1%, from about 3% to about 60% or even
from about 5% to about 40% builder by weight of the subject cleaning composition.
Builders include, but are not limited to, the alkali metal, ammonium and
alkanolammonium salts of polyphosphates, alkali metal silicates, alkaline earth and alkali
metal carbonates, aluminosilicate builders, polycarboxylate compounds, ether
hydroxypolycarboxylates, copolymers of maleic anhydride with ethylene or vinyl methyl
ether, 1,3,5-trihydroxy benzene-2,4,6-trisulphonic acid, and carboxymethyloxysuccinic
acid, the various alkali metal, ammonium and substituted ammonium salts of polyacetic
acids such as ethylenediamine tetraacetic acid and nitrilotriacetic acid, as well as
polycarboxylates such as mellitic acid, succinic acid, citric acid, oxydisuccinic acid,
polymaleic acid, benzene 1,3,5-tricarboxylic acid, carboxymethyloxysuccinic acid, and
soluble salts thereof.

Chelating Agents - The cleaning compositions herein may contain a chelating agent.
Suitable chelating agents include copper, iron and/or manganese chelating agents and
mixtures thereof.

When a chelating agent is used, the cleaning composition may comprise from about
0.1% to about 15% or even from about 3.0% to about 10% chelating agent by weight of the
subject cleaning composition.

Deposition Aid - The cleaning compositions herein may contain a deposition aid.
Suitable deposition aids include polyethylene glycol, polypropylene glycol, polycarboxylate,
soil release polymers such as polyterephthalic acid, clays such as kaolinite, montmorillonite,
attapulgite, illite, bentonite, halloysite, and mixtures thereof.

Dye Transfer Inhibiting Agents - The cleaning compositions of the present invention
may also include one or more dye transfer inhibiting agents. Suitable polymeric dye transfer
inhibiting agents include, but are not limited to, polyvinylpyrrolidone polymers, polyamine
N-oxide polymers, copolymers of N-vinylpyrrolidone and N-vinylimidazole,
polyvinyloxazolidones and polyvinylimidazoles or mixtures thereof.

When present in a subject cleaning composition, the dye transfer inhibiting agents 
may be present at levels from about 0.0001% to about 10%, from about 0.01% to about 5% 
or even from about 0.1% to about 3% by weight of the cleaning composition. 

Dispersants - The cleaning compositions of the present invention can also contain
dispersants. Suitable water-soluble organic materials include the homo- or co-polymeric
acids or their salts, in which the polycarboxylic acid comprises at least two carboxyl radicals
separated from each other by not more than two carbon atoms.

Enzymes - The cleaning compositions can comprise one or more detergent enzymes
which provide cleaning performance and/or fabric care benefits. Examples of suitable
enzymes include, but are not limited to, hemicellulases, peroxidases, proteases, cellulases,
xylanases, lipases, phospholipases, esterases, cutinases, pectinases, keratinases,
reductases, oxidases, phenol oxidases, lipoxygenases, ligninases, pullulanases, tannases,
pentosanases, malanases, β-glucanases, arabinosidases, hyaluronidase, chondroitinase,
laccase, and amylases, or mixtures thereof. A typical combination is a cocktail of
conventional applicable enzymes like protease, lipase, cutinase and/or cellulase in
conjunction with amylase.

Enzyme Stabilizers - Enzymes for use in detergents can be stabilized by various
techniques. The enzymes employed herein can be stabilized by the presence of
water-soluble sources of calcium and/or magnesium ions in the finished compositions that
provide such ions to the enzymes.

Catalytic Metal Complexes - The cleaning compositions of the present invention may
include catalytic metal complexes. One type of metal-containing bleach catalyst is a catalyst
system comprising a transition metal cation of defined bleach catalytic activity, such as
copper, iron, titanium, ruthenium, tungsten, molybdenum, or manganese cations, an
auxiliary metal cation having little or no bleach catalytic activity, such as zinc or aluminum
cations, and a sequestrate having defined stability constants for the catalytic and auxiliary
metal cations, particularly ethylenediaminetetraacetic acid,
ethylenediaminetetra(methylenephosphonic acid) and water-soluble salts thereof. Such
catalysts are disclosed in U.S. Pat. No. 4,430,243.

If desired, the compositions herein can be catalyzed by means of a manganese
compound. Such compounds and levels of use are well known in the art and include, for
example, the manganese-based catalysts disclosed in U.S. Pat. No. 5,576,282.

Cobalt bleach catalysts useful herein are known, and are described, for example, in
U.S. Pat. Nos. 5,597,936, and 5,595,967. Such cobalt catalysts are readily prepared by
known procedures, such as taught for example in U.S. Pat. Nos. 5,597,936, and 5,595,967.

Compositions herein may also suitably include a transition metal complex of a
macropolycyclic rigid ligand - abbreviated as "MRL". As a practical matter, and not by way
of limitation, the compositions and cleaning processes herein can be adjusted to provide on
the order of at least one part per hundred million of the active MRL species in the aqueous
washing medium, and will preferably provide from about 0.005 ppm to about 25 ppm, more
preferably from about 0.05 ppm to about 10 ppm, and most preferably from about 0.1 ppm
to about 5 ppm, of the MRL in the wash liquor.

Preferred transition-metals in the instant transition-metal bleach catalyst include
manganese, iron and chromium. Preferred MRLs herein are a special type of ultra-rigid
ligand that is cross-bridged such as 5,12-diethyl-1,5,8,12-tetraazabicyclo[6.6.2]hexadecane.

Suitable transition metal MRLs are readily prepared by known procedures, such as
taught for example in WO 00/332601, and U.S. Pat. No. 6,225,464.

III) Processes of Making and Using Cleaning Compositions 

The cleaning compositions of the present invention can be formulated into any
suitable form and prepared by any process chosen by the formulator, non-limiting examples
of which are described in U.S. Pat. Nos. 5,879,584, 5,691,297, 5,574,005, 5,569,645,
5,516,448, 5,489,392, and 5,486,303, all of which are incorporated herein by reference.

IV) Method of Use 

The cleaning compositions disclosed herein can be used to clean a situs, inter alia
a surface or fabric. Typically at least a portion of the situs is contacted with an embodiment
of the present cleaning composition, in neat form or diluted in a wash liquor, and then the
situs is optionally washed and/or rinsed. For purposes of the present invention, washing
includes but is not limited to, scrubbing, and mechanical agitation. The fabric may comprise
most any fabric capable of being laundered in normal consumer use conditions. The
disclosed cleaning compositions are typically employed at concentrations of from about 500
ppm to about 15,000 ppm in solution. When the wash solvent is water, the water
temperature typically ranges from about 5°C to about 90°C and, when the situs comprises a
fabric, the water to fabric mass ratio is typically from about 1:1 to about 30:1.
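
The in-use concentration and water to fabric ratio given above can be turned into a simple dosing calculation. The Python sketch below is illustrative only: it assumes that 1 ppm corresponds to 1 mg of composition per liter of wash liquor (a common approximation for dilute aqueous solutions), and the function names and example wash volume are hypothetical.

```python
def dose_grams(wash_volume_l, target_ppm):
    """Grams of composition for a wash liquor, taking 1 ppm as 1 mg per liter."""
    if not 500 <= target_ppm <= 15_000:
        raise ValueError("target concentration is outside about 500-15,000 ppm")
    return wash_volume_l * target_ppm / 1000.0


def water_mass_range_kg(fabric_mass_kg, low_ratio=1.0, high_ratio=30.0):
    """Water mass bounds for a fabric load at water:fabric ratios of 1:1 to 30:1."""
    return fabric_mass_kg * low_ratio, fabric_mass_kg * high_ratio


# Hypothetical 10 liter wash at the two ends of the stated concentration range.
low_dose = dose_grams(10, 500)        # 5.0 g
high_dose = dose_grams(10, 15_000)    # 150.0 g
water_bounds = water_mass_range_kg(2.0)  # (2.0, 60.0) kg of water for 2 kg of fabric
```

Under these assumptions, a 10 liter wash would call for roughly 5 g to 150 g of composition, and a 2 kg fabric load would be washed in roughly 2 kg to 60 kg of water.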

B. Animal Feed 

Still further, the present invention provides compositions and methods for the
production of a food or animal feed, characterized in that protease according to the
invention is mixed with food or animal feed. In some embodiments, the protease is added
as a dry product before processing, while in other embodiments it is added as a liquid before
or after processing. In some embodiments, in which a dry powder is used, the enzyme is
diluted as a liquid onto a dry carrier such as milled grain. The proteases of the present
invention find use as components of animal feeds and/or additives such as those described
in U.S. Pat. No. 5,612,055, U.S. Pat. No. 5,314,692, and U.S. Pat. No. 5,147,642, all of
which are hereby incorporated by reference.

The enzyme feed additive according to the present invention is suitable for
preparation in a number of methods. For example, in some embodiments, it is prepared
simply by mixing different enzymes having the appropriate activities to produce an enzyme
mix. In some embodiments, this enzyme mix is mixed directly with a feed, while in other
embodiments, it is impregnated onto a cereal-based carrier material such as milled wheat,
maize or soya flour. The present invention also encompasses these impregnated carriers,
as they find use as enzyme feed additives.

In some alternative embodiments, a cereal-based carrier (e.g., milled wheat or
maize) is impregnated either simultaneously or sequentially with enzymes having the
appropriate activities. For example, in some embodiments, a milled wheat carrier is first
sprayed with a xylanase, secondly with a protease, and optionally with a β-glucanase. The
present invention also encompasses these impregnated carriers, as they find use as
enzyme feed additives. In preferred embodiments, these impregnated carriers comprise at
least one protease of the present invention.

In some embodiments, the feed additive of the present invention is directly mixed
with the animal feed, while in alternative embodiments, it is mixed with one or more other
feed additives such as a vitamin feed additive, a mineral feed additive, and/or an amino acid
feed additive. The resulting feed additive including several different types of components is
then mixed in an appropriate amount with the feed.

In some preferred embodiments, the feed additive of the present invention, including
cereal-based carriers, is normally mixed in amounts of 0.01-50 g per kilogram of feed, more
preferably 0.1-10 g/kilogram, and most preferably about 1 g/kilogram.
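
The inclusion rates above translate directly into batch quantities. The following Python sketch is a minimal illustration, assuming a hypothetical helper name and a hypothetical 500 kg feed batch; only the 0.01-50 g per kilogram range and the approximately 1 g per kilogram default come from the text.

```python
def additive_mass_g(feed_mass_kg, rate_g_per_kg=1.0):
    """Grams of feed additive for a batch at the given inclusion rate."""
    # The broadest range stated in the text is 0.01-50 g per kilogram of feed.
    if not 0.01 <= rate_g_per_kg <= 50.0:
        raise ValueError("inclusion rate is outside 0.01-50 g per kilogram")
    return feed_mass_kg * rate_g_per_kg


# Hypothetical 500 kg batch at the most preferred rate and at 2 g per kilogram.
default_dose = additive_mass_g(500)       # 500.0 g
higher_dose = additive_mass_g(500, 2.0)   # 1000.0 g
```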

In alternative embodiments, the enzyme feed additive of the present invention
involves construction of recombinant microorganisms that produce the desired enzyme(s)
in the desired relative amounts. In some embodiments, this is accomplished by increasing
the copy number of the gene encoding at least one protease of the present invention, and/or
by using a suitably strong promoter operatively linked to the polynucleotide encoding the
protease(s). In further embodiments, the recombinant microorganism strain has certain
enzyme activities deleted (e.g., cellulases, endoglucanases, etc.), as desired.

In additional embodiments, the enzyme feed additives provided by the present
invention also include other enzymes, including but not limited to at least one xylanase,
α-amylase, glucoamylase, pectinase, mannanase, α-galactosidase, phytase, and/or lipase. In
some embodiments, the enzymes having the desired activities are mixed with the xylanase
and protease either before impregnating these on a cereal-based carrier or alternatively
such enzymes are impregnated simultaneously or sequentially on such a cereal-based
carrier. The carrier is then in turn mixed with a cereal-based feed to prepare the final feed.
In alternative embodiments, the enzyme feed additive is formulated as a solution of the
individual enzyme activities and then mixed with a feed material pre-formed as pellets or as
a mash.

In still further embodiments, the enzyme feed additive is included in animals' diets by
incorporating it into a second (i.e., different) feed or the animals' drinking water.

Accordingly, it is not essential that the enzyme mix provided by the present invention be
incorporated into the cereal-based feed itself, although such incorporation forms a
particularly preferred embodiment of the present invention. The ratio of the units of
xylanase activity per g of the feed additive to the units of protease activity per g of the feed
additive is preferably 1:0.001-1,000, more preferably 1:0.01-100, and most preferably
1:0.1-10. As indicated above, the enzyme mix provided by the present invention preferably
finds use as a feed additive in the preparation of a cereal-based feed.

In some embodiments, the cereal-based feed comprises at least 25% by weight, or
more preferably at least 35% by weight, wheat or maize or a combination of both of these
cereals. The feed further comprises a protease (i.e., at least one protease of the present
invention) in such an amount that the feed includes 100-100,000 units of protease activity
per kg.
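
The activity figures in this section (units of protease per kg of feed, and the xylanase to protease activity ratio of the feed additive) can be checked with simple arithmetic. The Python sketch below is a hedged illustration: the function names, the tier labels, and the example activities are assumptions, and the tiers simply restate the 1:0.001-1,000, 1:0.01-100 and 1:0.1-10 ranges as protease units per unit of xylanase.

```python
def protease_units_for_batch(feed_mass_kg, units_per_kg):
    """Total protease activity units needed for a feed batch."""
    if not 100 <= units_per_kg <= 100_000:
        raise ValueError("target is outside 100-100,000 units per kg of feed")
    return feed_mass_kg * units_per_kg


def ratio_tier(xylanase_units_per_g, protease_units_per_g):
    """Classify the additive by protease units per unit of xylanase activity."""
    ratio = protease_units_per_g / xylanase_units_per_g
    if 0.1 <= ratio <= 10:
        return "most preferred"
    if 0.01 <= ratio <= 100:
        return "more preferred"
    if 0.001 <= ratio <= 1000:
        return "preferred"
    return "outside stated ranges"


# Hypothetical values: a 200 kg batch dosed at 5,000 units per kg.
total_units = protease_units_for_batch(200, 5_000)   # 1000000
tier = ratio_tier(1000, 500)                          # "most preferred"
```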

Cereal-based feeds provided according to the present
invention find use as feed for a variety of non-human animals, including poultry (e.g.,
turkeys, geese, ducks, chickens, etc.), livestock (e.g., pigs, sheep, cattle, goats, etc.), and
companion animals (e.g., horses, dogs, cats, rabbits, mice, etc.). The feeds are particularly
suitable for poultry and pigs, and in particular broiler chickens.

C. Textile and Leather Treatment 

The present invention also provides compositions for the treatment of textiles that
include at least one of the proteases of the present invention. In some embodiments, at
least one protease of the present invention is a component of compositions suitable for the
treatment of silk or wool (See e.g., U.S. RE Pat. No. 216,034, EP 134,267, U.S. Pat. No.
4,533,359, and EP 344,259).

In addition, the proteases of the present invention find use in a variety of applications
where it is desirable to separate phosphorous from phytate. Accordingly, the present
invention also provides methods of producing wool or animal hair material with improved
properties. In some preferred embodiments, these methods comprise the steps of
pretreating wool, wool fibres or animal hair material in a process selected from the group
consisting of plasma treatment processes and the Delhey process; and subjecting the
pretreated wool or animal hair material to a treatment with a proteolytic enzyme (e.g., at
least one protease of the present invention) in an amount effective for improving the
properties. In some embodiments, the proteolytic enzyme treatment occurs prior to the
plasma treatment, while in other embodiments, it occurs after the plasma treatment. In
some further embodiments, it is conducted as a separate step, while in other embodiments,
it is conducted in combination with the scouring or the dyeing of the wool or animal hair
material. In additional embodiments, at least one surfactant and/or at least one softener is
present during the enzyme treatment step, while in other embodiments, the surfactant(s)
and/or softener(s) are incorporated in a separate step wherein the wool or animal hair
material is subjected to a softening treatment.

In some embodiments, the compositions of the present invention find use in methods
for shrink-proofing wool fibers (See e.g., JP 4-327274). In some embodiments, the
compositions are used in methods for shrink-proofing treatment of wool fibers by subjecting
the fibers to a low-temperature plasma treatment, followed by treatment with a
shrink-proofing resin such as a block-urethane resin, polyamide epichlorohydrin resin,
glyoxalic resin, ethylene-urea resin or acrylate resin, and then treatment with a weight
reducing proteolytic enzyme for obtaining a softening effect. In some embodiments, the
plasma treatment step is a low-temperature treatment, preferably a corona discharge
treatment or a glow discharge treatment.

In some embodiments, the low-temperature plasma treatment is carried out by using
a gas, preferably a gas selected from the group consisting of air, oxygen, nitrogen,
ammonia, helium, or argon. Conventionally, air is used but it may be advantageous to use
any of the other indicated gasses.

Preferably, the low-temperature plasma treatment is carried out at a pressure
between about 0.1 torr and 5 torr for from about 2 seconds to about 300 seconds, preferably
for about 5 seconds to about 100 seconds, more preferably from about 5 seconds to about
30 seconds.

As indicated above, the present invention finds use in conjunction with methods such
as the Delhey process (See e.g., DE-A-43 32 692). In this process, the wool is treated in an
aqueous solution of hydrogen peroxide in the presence of soluble wolframate, optionally
followed by treatment in a solution or dispersion of synthetic polymers, for improving the
anti-felting properties of the wool. In this method, the wool is treated in an aqueous solution
of hydrogen peroxide (0.1-35% (w/w), preferably 2-10% (w/w)), in the presence of 2-60%
(w/w), preferably 8-20% (w/w), of a catalyst (preferably Na2WO4), and in the presence of a
nonionic wetting agent. Preferably, the treatment is carried out at pH 8-11, and room
temperature. The treatment time depends on the concentrations of hydrogen peroxide and
catalyst, but is preferably 2 minutes or less. After the oxidative treatment, the wool is rinsed
with water. For removal of residual hydrogen peroxide, and optionally for additional
bleaching, the wool is further treated in acidic solutions of reducing agents (e.g., sulfites,
phosphites etc.).

In some embodiments, the enzyme treatment step is carried out for between about 1
minute and about 120 minutes. This step is preferably carried out at a temperature of
between about 20°C and about 60°C, more preferably between about 30°C and about
50°C. Alternatively, the wool is soaked in or padded with an aqueous enzyme solution and
then subjected to steaming at a conventional temperature and pressure, typically for about
30 seconds to about 3 minutes. In some preferred embodiments, the proteolytic enzyme
treatment is carried out in an acidic or neutral or alkaline medium which may include a
buffer.

In alternative embodiments, the enzyme treatment step is conducted in the presence
of one or more conventional anionic, non-ionic (e.g., Dobanol; Henkel AG) or cationic
surfactants. An example of a useful nonionic surfactant is Dobanol (from Henkel AG). In
further embodiments, the wool or animal hair material is subjected to an ultrasound
treatment, either prior to or simultaneous with the treatment with a proteolytic enzyme. In
some preferred embodiments, the ultrasound treatment is carried out at a temperature of
about 50°C for about 5 minutes. In some preferred embodiments, the amount of proteolytic
enzyme used in the enzyme treatment step is between about 0.2 w/w % and about 10 w/w
%, based on the weight of the wool or animal hair material. In some embodiments, in order
to reduce the number of treatment steps, the enzyme treatment is carried out during dyeing
and/or scouring of the wool or animal hair material, simply by adding the protease to the
dyeing, rinsing and/or scouring bath. In some embodiments, enzyme treatment is carried out
after the plasma treatment, but in other embodiments, the two treatment steps are carried
out in the opposite order.

Softeners conventionally used on wool are usually cationic softeners, either organic 
cationic softeners or silicone based products, but anionic or non-ionic softeners are also 
useful. Examples of useful softeners include, but are not limited to polyethylene softeners 
and silicone softeners (i.e., dimethyl polysiloxanes (silicone oils)), H-polysiloxanes, silicone 
elastomers, aminofunctional dimethyl polysiloxanes, aminofunctional silicone elastomers, 
and epoxyfunctional dimethyl polysiloxanes, and organic cationic softeners (e.g., alkyl
quaternary ammonium derivatives).

In additional embodiments, the present invention provides compositions for the
treatment of an animal hide that includes at least one protease of the present invention. In
some embodiments, the proteases of the present invention find use in compositions for
treatment of animal hide, such as those described in WO 03/00865 (Insect Biotech Co.,
Taejeon-Si, Korea). In additional embodiments, the present invention provides methods for
processing hides and/or skins into leather comprising enzymatic treatment of the hide or
skin with the protease of the present invention (See e.g., WO 96/11285). In additional
embodiments, the present invention provides compositions for the treatment of an animal
skin or hide into leather that includes at least one protease of the present invention.

Hides and skins are usually received in the tanneries in the form of salted or dried
raw hides or skins. The processing of hides or skins into leather comprises several different
process steps including the steps of soaking, unhairing and bating. These steps constitute
the wet processing and are performed in the beamhouse. Enzymatic treatment utilizing the
proteases of the present invention is applicable at any time during the process involved in
the processing of leather. However, proteases are usually employed during the wet
processing (i.e., during soaking, unhairing and/or bating). Thus, in some preferred
embodiments, the enzymatic treatment with at least one of the proteases of the present
invention occurs during the wet processing stage.

In some embodiments, the soaking processes of the present invention are
performed under conventional soaking conditions (e.g., at a pH in the range pH 6.0-11).
In some preferred embodiments, the range is pH 7.0-10.0. In alternative embodiments,
the temperature is in the range of 20-30°C, while in other embodiments it is preferably in
the range 24-28°C. In yet further embodiments, the reaction time is in the range 2-24
hours, while the preferred range is 4-16 hours. In additional embodiments, tensides and/or
preservatives are provided as desired.

The second phase of the bating step usually commences with the addition of the
bate itself. In some embodiments, the enzymatic treatment takes place during bating. In
some preferred embodiments, the enzymatic treatment takes place during bating, after the
deliming phase. In some embodiments, the bating process of the present invention is
performed using conventional conditions (e.g., at a pH in the range pH 6.0-9.0). In some
preferred embodiments, the pH range is 6.0 to 8.5. In further embodiments, the
temperature is in the range of 20-30°C, while in preferred embodiments, the temperature is
in the range of 25-28°C. In some embodiments, the reaction time is in the range of 20-90
minutes, while in other embodiments, it is in the range 40-80 minutes. Processes for the
manufacture of leather are well known to those skilled in the art (See e.g., WO 94/069429,
WO 90/1121189, U.S. Pat. No. 3,840,433, EP 505920, GB 2233665, and U.S. Pat. No.
3,986,926, all of which are herein incorporated by reference).

In further embodiments, the present invention provides bates comprising at least one
protease of the present invention. A bate is an agent or an enzyme-containing preparation
comprising the chemically active ingredients for use in beamhouse processes, in particular
in the bating step of a process for the manufacture of leather. In some embodiments, the
present invention provides bates comprising protease and suitable excipients. In some
embodiments, the excipients are agents including, but not limited to, chemicals known and
used in the art (e.g., diluents, emulgators, delimers and carriers). In some embodiments,
the bate comprising at least one protease of the present invention is formulated as known
in the art (See e.g., GB-A2250289, WO 96/11285, and EP 0784703).

In some embodiments, the bate of the present invention contains from 0.00005 to
0.01 g of active protease per g of bate, while in other embodiments, the bate contains from
0.0002 to 0.004 g of active protease per g of bate.

Thus, the proteases of the present invention find use in numerous applications and
settings.



EXPERIMENTAL 

The present invention is described in further detail in the following Examples which
are not in any way intended to limit the scope of the invention as claimed. The attached
Figures are meant to be considered as integral parts of the specification and description of
the invention. All references cited are herein specifically incorporated by reference for all
that is described therein. The following Examples are offered to illustrate, but not to limit the
claimed invention.

In the experimental disclosure which follows, the following abbreviations apply: PI
(proteinase inhibitor), ppm (parts per million); M (molar); mM (millimolar); µM (micromolar);
nM (nanomolar); mol (moles); mmol (millimoles); µmol (micromoles); nmol (nanomoles); gm
(grams); mg (milligrams); µg (micrograms); pg (picograms); L (liters); ml and mL (milliliters);
µl and µL (microliters); cm (centimeters); mm (millimeters); µm (micrometers); nm
(nanometers); U (units); V (volts); MW (molecular weight); sec (seconds); min(s)
(minute/minutes); h(s) and hr(s) (hour/hours); °C (degrees Centigrade); QS (quantity
sufficient); ND (not done); NA (not applicable); rpm (revolutions per minute); H2O (water);
dH2O (deionized water); HCl (hydrochloric acid); aa (amino acid); bp (base pair); kb
(kilobase pair); kD (kilodaltons); cDNA (copy or complementary DNA); DNA
(deoxyribonucleic acid); ssDNA (single stranded DNA); dsDNA (double stranded DNA);
dNTP (deoxyribonucleotide triphosphate); RNA (ribonucleic acid); MgCl2 (magnesium
chloride); NaCl (sodium chloride); w/v (weight to volume); v/v (volume to volume); g
(gravity); OD (optical density); Dulbecco's phosphate buffered solution (DPBS); SOC (2%
Bacto-Tryptone, 0.5% Bacto Yeast Extract, 10 mM NaCl, 2.5 mM KCl); Terrific Broth (TB; 12
g/l Bacto Tryptone, 24 g/l glycerol, 2.31 g/l KH2PO4, and 12.54 g/l K2HPO4); OD280 (optical
density at 280 nm); OD600 (optical density at 600 nm); A405 (absorbance at 405 nm); Vmax
(the maximum initial velocity of an enzyme catalyzed reaction); PAGE (polyacrylamide gel
electrophoresis); PBS (phosphate buffered saline [150 mM NaCl, 10 mM sodium phosphate
buffer, pH 7.2]); PBST (PBS+0.25% TWEEN® 20); PEG (polyethylene glycol); PCR
(polymerase chain reaction); RT-PCR (reverse transcription PCR); SDS (sodium dodecyl
sulfate); Tris (tris(hydroxymethyl)aminomethane); HEPES (N-[2-Hydroxyethyl]piperazine-
N-[2-ethanesulfonic acid]); HBS (HEPES buffered saline); SDS (sodium dodecylsulfate);
Tris-HCl (tris[Hydroxymethyl]aminomethane-hydrochloride); Tricine (N-[tris-(hydroxymethyl)-
methyl]-glycine); CHES (2-(N-cyclo-hexylamino) ethane-sulfonic acid); TAPS (3-{[tris-
(hydroxymethyl)-methyl]-amino}-propanesulfonic acid); CAPS (3-(cyclo-hexylamino)-
propane-sulfonic acid); DMSO (dimethyl sulfoxide); DTT (1,4-dithio-DL-threitol); SA (sinapinic
acid (3,5-dimethoxy-4-hydroxy cinnamic acid)); TCA (trichloroacetic acid); Glut and GSH
(reduced glutathione); GSSG (oxidized glutathione); TCEP (Tris[2-carboxyethyl] phosphine);
Ci (Curies); mCi (milliCuries); µCi (microCuries); HPLC (high pressure liquid
chromatography); RP-HPLC (reverse phase high pressure liquid chromatography); TLC
(thin layer chromatography); MALDI-TOF (matrix-assisted laser desorption/ionization-time
of flight); Ts (tosyl); Bn (benzyl); Ph (phenyl); Ms (mesyl); Et (ethyl), Me (methyl); Taq
(Thermus aquaticus DNA polymerase); Klenow (DNA polymerase I large (Klenow)
fragment); rpm (revolutions per minute); EGTA (ethylene glycol-bis(β-aminoethyl ether) N,
N, N', N'-tetraacetic acid); EDTA (ethylenediaminetetraacetic acid); bla (β-lactamase or
ampicillin-resistance gene); HDL (heavy duty liquid detergent, i.e., laundry detergent); MJ
Research (MJ Research, Reno, NV); Baseclear (Baseclear BV, Inc., Leiden, the
Netherlands); PerSeptive (PerSeptive Biosystems, Framingham, MA); ThermoFinnigan
(ThermoFinnigan, San Jose, CA); Argo (Argo BioAnalytica, Morris Plains, NJ); Seitz EKS
(SeitzSchenk Filtersystems GmbH, Bad Kreuznach, Germany); Pall (Pall Corp., East Hills,
NY); Spectrum (Spectrum Laboratories, Dominguez Rancho, CA); Molecular Structure
(Molecular Structure Corp., Woodlands, TX); Accelrys (Accelrys, Inc., San Diego, CA);
Chemical Computing (Chemical Computing Corp., Montreal, Canada); New Brunswick (New
Brunswick Scientific, Co., Edison, NJ); CFT (Center for Test Materials, Vlaardingen, the
Netherlands); Procter & Gamble (Procter & Gamble, Inc., Cincinnati, OH); GE Healthcare
(GE Healthcare, Chalfont St. Giles, United Kingdom); DNA2.0 (DNA2.0, Menlo Park, CA);
OXOID (Oxoid, Basingstoke, Hampshire, UK); Megazyme (Megazyme International Ireland
Ltd., Bray Business Park, Bray, Co., Wicklow, Ireland); Finnzymes (Finnzymes Oy, Espoo,
Finland); Kelco (CP Kelco, Wilmington, DE); Corning (Corning Life Sciences, Corning, NY);
NEN (NEN Life Science Products, Boston, MA); Pharma AS (Pharma AS, Oslo, Norway);
Dynal (Dynal, Oslo, Norway); Bio-Synthesis (Bio-Synthesis, Lewisville, TX); ATCC
(American Type Culture Collection, Rockville, MD); Gibco/BRL (Gibco/BRL, Grand Island,
NY); Sigma (Sigma Chemical Co., St. Louis, MO); Pharmacia (Pharmacia Biotech,
Piscataway, NJ); NCBI (National Center for Biotechnology Information); Applied Biosystems
(Applied Biosystems, Foster City, CA); BD Biosciences and/or Clontech (BD Biosciences
CLONTECH Laboratories, Palo Alto, CA); Operon Technologies (Operon Technologies,
Inc., Alameda, CA); MWG Biotech (MWG Biotech, High Point, NC); Oligos Etc (Oligos Etc.
Inc, Wilsonville, OR); Bachem (Bachem Bioscience, Inc., King of Prussia, PA); Difco (Difco
Laboratories, Detroit, MI); Mediatech (Mediatech, Herndon, VA); Santa Cruz (Santa Cruz
Biotechnology, Inc., Santa Cruz, CA); Oxoid (Oxoid Inc., Ogdensburg, NY); Worthington
(Worthington Biochemical Corp., Freehold, NJ); GIBCO BRL or Gibco BRL (Life
Technologies, Inc., Gaithersburg, MD); Millipore (Millipore, Billerica, MA); Bio-Rad (Bio-Rad,
Hercules, CA); Invitrogen (Invitrogen Corp., San Diego, CA); NEB (New England Biolabs,
Beverly, MA); Sigma (Sigma Chemical Co., St. Louis, MO); Pierce (Pierce Biotechnology,
Rockford, IL); Takara (Takara Bio Inc., Otsu, Japan); Roche (Hoffmann-La Roche, Basel,
Switzerland); EM Science (EM Science, Gibbstown, NJ); Qiagen (Qiagen, Inc., Valencia,
CA); Biodesign (Biodesign Intl., Saco, Maine); Aptagen (Aptagen, Inc., Herndon, VA);
Sorvall (Sorvall brand, from Kendro Laboratory Products, Asheville, NC); Molecular Devices
(Molecular Devices, Corp., Sunnyvale, CA); R&D Systems (R&D Systems, Minneapolis,
MN); Stratagene (Stratagene Cloning Systems, La Jolla, CA); Marsh (Marsh Biosciences,
Rochester, NY); Bio-Tek (Bio-Tek Instruments, Winooski, VT); Biacore (Biacore, Inc.,
Piscataway, NJ); PeproTech (PeproTech, Rocky Hill, NJ); SynPep (SynPep, Dublin, CA);
New Objective (New Objective brand; Scientific Instrument Services, Inc., Ringoes, NJ);
Waters (Waters, Inc., Milford, MA); Matrix Science (Matrix Science, Boston, MA); Dionex
(Dionex, Corp., Sunnyvale, CA); Monsanto (Monsanto Co., St. Louis, MO); Wintershall
(Wintershall AG, Kassel, Germany); BASF (BASF Co., Florham Park, NJ); Huntsman
(Huntsman Petrochemical Corp., Salt Lake City, UT); Enichem (Enichem Iberica, Barcelona,
Spain); Fluka Chemie AG (Fluka Chemie AG, Buchs, Switzerland); Gist-Brocades (Gist-
Brocades, NV, Delft, the Netherlands); Dow Corning (Dow Corning Corp., Midland, MI); and
Microsoft (Microsoft, Inc., Redmond, WA).



EXAMPLE 1 

Assays

In the following Examples, various assays were used, such as protein
determinations, application-based tests, and stability-based tests. For ease in reading, the
following assays are set forth below and referred to in the respective Examples. Any
deviations from the protocols provided below in any of the experiments performed during the
development of the present invention are indicated in the Examples.

Some of the detergents used in the following Examples had the following
compositions. In Compositions I and II, the balance (to 100%) is perfume/dye and/or water.
The pH of these compositions was from about 5 to about 7 for Composition I, and about 7.5
to about 8.5 for Composition II. In Composition III, the balance (to 100%) comprised water
and/or the minors perfume, dye, brightener/SRP1/sodium
carboxymethylcellulose/photobleach/MgSO4/PVPVI/suds suppressor/high molecular
PEG/clay.



DETERGENT COMPOSITIONS

                                       Composition I    Composition II
LAS                                    24.0             8.0
C12-C15 AE1.8S                                          11.0
C8-C10 propyl dimethyl amine           2.0              2.0
C12-C14 alkyl dimethyl amine oxide
C12-C15 AS                                              7.0
CFAA                                                    4.0
C12-C14 Fatty alcohol ethoxylate       12.0             1.0
C12-C18 Fatty acid                     3.0              4.0
Citric acid (anhydrous)                6.0              3.0
DETPMP                                                  1.0
Monoethanolamine                       5.0              5.0
Sodium hydroxide                                        1.0
1 N HCl aqueous solution               #1
Propanediol                            12.7             10.
Ethanol                                1.8              5.4
DTPA                                   0.5              0.4
Pectin Lyase                           -                0.005
Lipase                                 0.1              -
Amylase                                0.001            -
Cellulase                              -                0.0002
Protease A                             -                -
Aldose Oxidase                         -                -
DETBCHD                                -                0.01
SRP1                                   0.5              0.3
Boric acid                             2.4              2.8
Sodium xylene sulfonate
DC 3225C                               1.0              1.0
2-butyl-octanol                        0.03             0.03
Brightener 1                           0.12             0.08



Composition III 

C14-C15 AS or sodium tallow alkyl sulfate
LAS
5AE3S
C12-C15 E5 or E3
QAS
Zeolite A
SKS-6 (dry add)
MA/AA
AA
3Na Citrate 2H2O
Citric Acid (Anhydrous)
DTPA
EDDS
HEDP
PB1



3.0 

8.0 

1.0
5.0 

11.0
9.0 
2.0 



1.5 

0.5 
0.2 
Percarbonate                3.8
NOBS
NACA OBS                    2.0
TAED                        2.0
BB1                         0.34
BB2
Anhydrous Na Carbonate      8.0
Sulfate                     2.0
Silicate
Protease B
Protease C
Lipase
Amylase
Cellulase
Pectin Lyase                0.001
Aldose Oxidase              0.05
PAAC



A. TCA Assay for Protein Content Determination in 96-well Microtiter Plates

This assay was started using filtered culture supernatant from microtiter plates grown
4 days at 33 °C with shaking at 230 RPM and humidified aeration. A fresh 96-well flat
bottom plate was used for the assay. First, 100 µL/well of 0.25 N HCl were placed in the
wells. Then, 50 µL filtered culture broth were added to the wells. The light
scattering/absorbance at 405 nm (use 5 sec mixing mode in the plate reader) was then
determined, in order to provide the "blank" reading.

For the test, 100 µL/well 15% (w/v) TCA was placed in the plates and incubated
between 5 and 30 min at room temperature. The light scattering/absorbance at 405 nm
(use 5 sec mixing mode in the plate reader) was then determined.

The calculations were performed by subtracting the blank (i.e., no TCA) from the test
reading with TCA. If desired, a standard curve can be created by calibrating the TCA
readings with AAPF assays of clones with known conversion factors. However, the TCA
results are linear with respect to protein concentration from 50 to 500 ppm and can thus be
plotted directly against enzyme performance for the purpose of choosing good-performing
variants.
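
As an illustration only, the blank correction described above can be sketched as follows. The well names and absorbance values are invented for the example, and the helper names are hypothetical rather than part of the protocol.

```python
# Minimal sketch of the TCA blank correction (hypothetical helper names).
# Each well has two A405 readings: the "blank" (HCl + broth, no TCA)
# and the "test" reading taken after TCA addition and incubation.

def tca_signal(test_a405, blank_a405):
    """Return the blank-corrected TCA reading for one well."""
    return test_a405 - blank_a405


def in_linear_range(protein_ppm, low=50.0, high=500.0):
    """True if a protein concentration lies in the 50-500 ppm range
    over which the text states the TCA response is linear."""
    return low <= protein_ppm <= high


# Invented example readings: well -> (test, blank)
readings = {"A1": (0.412, 0.087), "A2": (0.655, 0.091)}
corrected = {well: tca_signal(test, blank) for well, (test, blank) in readings.items()}
```

The corrected values can then be plotted against a performance measure, as the text suggests, provided the samples fall within the stated linear range.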

B. suc-AAPF-pNA Assay of Proteases in 96-well Microtiter Plates 

In this assay system, the reagent solutions used were: 

1. 100 mM Tris/HCl, pH 8.6, containing 0.005% TWEEN®-80 (Tris buffer)

2. 100 mM Tris buffer, pH 8.6, containing 10 mM CaCl2 and 0.005% TWEEN®-80 (Tris/Ca buffer)

3. 160 mM suc-AAPF-pNA in DMSO (suc-AAPF-pNA stock solution) (Sigma: S-7388)

To prepare suc-AAPF-pNA working solution, 1 ml AAPF stock was added to 100 ml 
Tris/Ca buffer and mixed well for at least 10 seconds. 
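
The dilution above can be checked with a short calculation. The sketch below assumes a molecular weight of about 624.6 g/mol for suc-AAPF-pNA (an assumption not stated in the text) and simple additive volumes; it suggests that the working solution is roughly 1.6 mM, or about 1 mg/ml.

```python
# Sketch: concentration of the suc-AAPF-pNA working solution.
# Assumption (not from the text): MW of suc-AAPF-pNA is ~624.6 g/mol.
SUC_AAPF_PNA_MW = 624.6  # g/mol, assumed


def diluted_conc(stock_conc, stock_vol, diluent_vol):
    """Concentration after adding stock_vol of stock to diluent_vol of buffer
    (volumes in the same unit, assumed additive)."""
    return stock_conc * stock_vol / (stock_vol + diluent_vol)


# 1 ml of 160 mM stock added to 100 ml Tris/Ca buffer
working_mM = diluted_conc(160.0, 1.0, 100.0)
# mM (mmol/L) times g/mol gives mg/L; divide by 1000 for mg/ml
working_mg_per_ml = working_mM * SUC_AAPF_PNA_MW / 1000.0
```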

The assay was performed by adding 10 µl of diluted protease solution to each well,
followed by the addition (quickly) of 190 µl 1 mg/ml AAPF-working solution. The
solutions were mixed for 5 sec, and the absorbance change was read at 410 nm in
an MTP reader, at 25°C. The protease activity was expressed as AU (activity =
ΔOD·min⁻¹·ml⁻¹).
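
One way to obtain an activity in ΔOD·min⁻¹·ml⁻¹ from a kinetic read is sketched below. It assumes that the rate is the least-squares slope of absorbance against time and that the "per ml" refers to the 10 µl (0.010 ml) of diluted protease added per well; both are assumptions made for illustration, and the function names and readings are hypothetical.

```python
# Sketch: AAPF activity as delta-OD per minute per ml of enzyme solution.
# Assumptions: least-squares slope of A410 vs. time; normalisation by the
# 0.010 ml of diluted protease added to each well.

def slope_per_min(times_min, ods):
    """Least-squares slope of absorbance against time (OD per minute)."""
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_od = sum(ods) / n
    num = sum((t - mean_t) * (od - mean_od) for t, od in zip(times_min, ods))
    den = sum((t - mean_t) ** 2 for t in times_min)
    return num / den


def activity_au(times_min, ods, enzyme_vol_ml=0.010):
    """Activity in delta-OD per minute per ml of enzyme solution added."""
    return slope_per_min(times_min, ods) / enzyme_vol_ml


# Invented example: absorbance rising by 0.05 OD per minute
example_activity = activity_au([0.0, 1.0, 2.0, 3.0], [0.10, 0.15, 0.20, 0.25])
```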

C. Keratin Hydrolysis Assay 

In this assay system, the chemical and reagent solutions used were: 

Keratin: ICN 902111

Detergent: Detergent Composition II. 1.6 g detergent is dissolved in 1000 ml water
(pH = 8.2). 0.6 ml CaCl2/MgCl2 of 10,000 gpg is added as well as 1190 mg
HEPES, giving a hardness and buffer strength of 6 gpg and 5 mM,
respectively. The pH is adjusted to 8.2 with NaOH.

Picrylsulfonic acid (TNBS): Sigma P-2297 (5% solution in water)

Reagent A: 45.4 g Na2B4O7·10H2O (Merck 6308) and 15 ml of 4N NaOH are
dissolved together to a final volume of 1000 ml (by heating if needed)

Reagent B: 35.2 g NaH2PO4·1H2O (Merck 6346) and 0.6 g Na2SO3 (Merck 6657)
are dissolved together to a final volume of 1000 ml.
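
The hardness and buffer strength quoted for the detergent solution above (6 gpg and 5 mM) can be reproduced with a short calculation. The sketch assumes a HEPES molecular weight of about 238.3 g/mol, which is not given in the text, and neglects the small volume added by the hardness stock.

```python
# Sketch: checking the 6 gpg hardness and 5 mM HEPES figures.
# Assumption (not from the text): HEPES MW is ~238.3 g/mol.
HEPES_MW = 238.3  # g/mol, assumed


def hardness_gpg(stock_gpg, stock_ml, final_ml):
    """Hardness after diluting a concentrated hardness stock (grains per gallon)."""
    return stock_gpg * stock_ml / final_ml


def molarity_mM(mass_mg, mw_g_per_mol, volume_ml):
    """Millimolar concentration of mass_mg of solute dissolved in volume_ml."""
    mmol = mass_mg / mw_g_per_mol
    return mmol / volume_ml * 1000.0


hardness = hardness_gpg(10_000, 0.6, 1000.0)   # about 6 gpg
hepes_mM = molarity_mM(1190.0, HEPES_MW, 1000.0)  # about 5 mM
```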

Method: 

Prior to the incubations, keratin was sieved on a 100 µm sieve in small portions at a
time. Then, 10 g of the < 100 µm keratin was stirred in detergent solution for at least 20
minutes at room temperature with regular adjustment of the pH to 8.2. Then, the
suspension was centrifuged for 20 minutes at room temperature (Sorvall, GSA rotor, 13,000
rpm). This procedure was then repeated. Finally, the wet sediment was suspended in
detergent to a total volume of 200 ml, and the suspension was kept stirred during pipetting.
Prior to incubation, microtiter plates (MTPs) were filled with 200 µl substrate per well with a
Biohit multichannel pipette and 1200 µl tip (6 dispenses of 200 µl, dispensed as fast as
possible to avoid settling of keratin in the tips). Then, 10 µl of the filtered culture was added
to the substrate containing MTPs. The plates were covered with tape, placed in an incubator
and incubated at 20 °C for 3 hours at 350 rpm (Innova 4330 [New Brunswick]). Following
incubation, the plates were centrifuged for 3 minutes at 3000 rpm (Sigma 6K 15 centrifuge).
About 15 minutes before removal of the 1st plate from the incubator, the TNBS reagent was
prepared by mixing 1 ml TNBS solution per 50 ml of reagent A.

MTPs were filled with 60 µl TNBS reagent A per well. From the incubated plates, 10
µl was transferred to the MTPs with TNBS reagent A. The plates were covered with tape
and shaken for 20 minutes in a bench shaker (BMG Thermostar) at room temperature and
500 rpm. Finally, 200 µl of reagent B was added to the wells, mixed for 1 minute on a
shaker, and the absorbance at 405 nm was measured with the MTP-reader.

Calculation of the Keratin Hydrolyzing Activity:

The obtained absorbance value was corrected for the blank value (substrate without
enzyme). The resulting absorbance provides a measure for the hydrolytic activity. For each
sample (variant) the performance index was calculated. The performance index compares
the performance of the variant (actual value) and the standard enzyme (theoretical value) at
the same protein concentration. In addition, the theoretical values can be calculated, using
the parameters of the Langmuir equation of the standard enzyme. A performance index (PI)
that is greater than 1 (PI>1) identifies a better variant (as compared to the standard [e.g.,
wild-type]), while a PI of 1 (PI=1) identifies a variant that performs the same as the standard,
and a PI that is less than 1 (PI<1) identifies a variant that performs worse than the standard.

Thus, the PI identifies winners, as well as variants that are less desirable for use under
certain circumstances.
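
The performance index calculation can be illustrated with a short sketch. The text does not give the exact form of the Langmuir equation used, so the sketch assumes the common two-parameter form, signal = s_max·K·c / (1 + K·c), fitted to the standard enzyme; the parameter names and the example numbers are hypothetical.

```python
# Sketch: performance index (PI) of a variant against a standard enzyme.
# Assumption: the standard's dose response follows a two-parameter
# Langmuir form; s_max and k are hypothetical fitted parameters.

def langmuir(conc_ppm, s_max, k):
    """Theoretical blank-corrected signal of the standard at conc_ppm."""
    return s_max * k * conc_ppm / (1.0 + k * conc_ppm)


def performance_index(variant_signal, conc_ppm, s_max, k):
    """Ratio of the variant's observed signal to the standard's
    theoretical signal at the same protein concentration."""
    return variant_signal / langmuir(conc_ppm, s_max, k)


# Invented example: standard fitted with s_max = 1.2, k = 0.5 per ppm;
# the variant gives a blank-corrected signal of 0.75 at 2 ppm.
pi_value = performance_index(0.75, 2.0, s_max=1.2, k=0.5)
is_better_than_standard = pi_value > 1.0
```

Under these invented numbers the standard's theoretical signal at 2 ppm is 0.6, so the variant would be classed as better than the standard.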



D. Microswatch Assay for Testing Protease Performance

None of the detergents used in these assays contained enzymes.

Detergent Preparations:

1. European Detergent Solution:

Milli-Q water was adjusted to 15 gpg water hardness (Ca/Mg=4/1), 7.6 g/l
ARIEL® Regular detergent was added, and the detergent solution was stirred vigorously for
at least 30 minutes. The detergent was filtered before use in the assay through a 0.22 µm
filter (e.g., Nalgene top bottle filter).

2. Japanese Detergent Solution:

Milli-Q water was adjusted to 3 gpg water hardness (Ca/Mg=3/1), 0.66 g/l
Detergent Composition III was added, and the detergent solution was stirred vigorously for
at least 30 minutes. The detergent was filtered before use in the assay through a 0.22 µm
filter (e.g., Nalgene top bottle filter).

3. Cold Water Liquid Detergent (US Conditions):

Milli-Q water was adjusted to 6 gpg water hardness (Ca/Mg=3/1), 1.60 g/l
TIDE® LVJ-1 detergent was added, and the detergent solution was stirred vigorously for at
least 15 minutes. Then, 5 mM HEPES buffer was added and the pH was set at 8.2. The
detergent was filtered before use in the assay through a 0.22 µm filter (e.g., Nalgene top
bottle filter).

4. Low pH Liquid Detergent (US Conditions):

Milli-Q water was adjusted to 6 gpg water hardness (Ca/Mg=3/1), 1.60 g/l Detergent
Composition I was added, and the detergent solution was stirred vigorously for at least 15
minutes. The pH was set at 6.0 using 1N NaOH solution. The detergent was filtered before
use in the assay through a 0.22 µm filter (e.g., Nalgene top bottle filter).

Microswatches: 

Microswatches of ¼" circular diameter were ordered from and delivered by CFT
Vlaardingen. The microswatches were pretreated using the fixation method described
below. Single microswatches were placed in each well of a 96-well microtiter plate vertically
to expose the whole surface area (i.e., not flat on the bottom of the well).

Bleach Fixation ("Superfixed"):

In a 10 L stainless steel beaker containing 10 L of water, the water was heated to
60°C for fixation of swatches used in European conditions (=Super fixed). For Japanese
condition(s) and other conditions, the swatches were fixed at room temperature (=3K).
Then, 10 ml of 30% hydrogen peroxide (1 ml/L of H2O2; final conc. of H2O2 is 300 ppm)
were added. Then, 100 swatches (10 swatches/L) were added to the solution. The solution
was allowed to sit for 30 minutes with occasional stirring and monitoring of the temperature.
The swatches were rinsed 7-8 times with cold water and placed on bench to dry. A towel
was placed on top of swatches, as this prevents the swatches from curling up. For the 3K
swatches, the procedure is repeated (except the water was not heated and 10x the amount
of hydrogen peroxide was added).

Alternative Fixation ("3K" Swatch Fixation): 

This particular swatch fixation was done at room temperature, however the amount
of 30% H2O2 added is 10X more than in the Superfixed Swatch Fixation. Bubble formation
(frothing) will be visible and therefore it is necessary to use a bigger beaker to account for
this. First, 8 liters of distilled water are placed in a 10 L beaker, and 80 ml of 30% hydrogen
peroxide are added. The water and peroxide are mixed well with a ladle. Then, 40 pieces
of EMPA 116 swatches were spread into a fan before adding into the solution to ensure
uniform fixation. The swatches were swirled in the solution (using the ladle) for 30 minutes,
continuously for the first five minutes and occasionally for the remaining 25 minutes. The
solution was discarded and the swatches were rinsed 6 times with approximately 6 liters of
distilled water each time. The swatches were placed on top of paper towels to dry. The air-
dried swatches were punched using a circular die on an expulsion press. A single
microswatch was placed vertically into each well of a 96-well microtiter plate to expose the
whole surface area (i.e., not flat on the bottom of the well).
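
The peroxide levels in the two fixation procedures can be compared with a rough calculation. The sketch below treats 1% as 10,000 ppm and ignores density differences and the volume added by the peroxide itself; these simplifications are assumptions, and the function name is hypothetical.

```python
# Sketch: approximate final H2O2 concentration in the fixation baths.
# Simplifying assumptions: 1% = 10,000 ppm, densities ignored, and the
# added peroxide volume neglected relative to the bath volume.

def h2o2_ppm(stock_ml, stock_percent, water_l):
    """Approximate final peroxide concentration in ppm."""
    fraction_of_bath = stock_ml / (water_l * 1000.0)
    return fraction_of_bath * stock_percent * 10_000


superfixed_ppm = h2o2_ppm(10.0, 30.0, 10.0)  # 10 ml of 30% in 10 L
three_k_ppm = h2o2_ppm(80.0, 30.0, 8.0)      # 80 ml of 30% in 8 L
```

Under these assumptions the Superfixed bath comes out at about 300 ppm and the "3K" bath at about 3000 ppm, consistent with the tenfold difference stated in the text.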

Enzyme Samples: 

The enzyme samples were tested at appropriate concentrations for the respective
geography, and diluted in 10 mM NaCl, 0.005% TWEEN®-80 solution.



Test Method: 

The incubator was set at the desired temperature: 20°C for cold water liquid
conditions; 30°C for low-pH liquid conditions; 40°C for European conditions; 20°C for
Japanese and North American conditions. The pretreated and precut swatches were placed
into the wells of a 96-well MTP, as described above. The enzyme samples were diluted, if
needed, in 10 mM NaCl, 0.005% TWEEN®-80 to 20x the desired concentration. The
desired detergent solutions were prepared as described above. Then, 190 µl of detergent
solution were added to each well of the MTP. To this mixture, 10 µl of enzyme solution were
added to each well (to provide a total volume of 200 µl/well). The MTP was sealed with a
plate sealer and placed in an incubator for 60 minutes, with agitation at 350 rpm. Following
incubation under the appropriate conditions, 100 µl of solution from each well were removed
and placed into a fresh MTP. The new MTP containing 100 µl of solution/well was read at
405 nm in a MTP reader. Blank controls, as well as a control containing a microswatch and
detergent but no enzyme were also included.
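
The "20x" dilution follows from the pipetting volumes: 10 µl of enzyme in a 200 µl well is a 20-fold dilution. A minimal sketch of the stock concentration needed for a given in-well dose is shown below; the function name is hypothetical and the 0.5 and 4 ppm targets are taken from the dosage range used in the assay.

```python
# Sketch: enzyme stock concentration needed for a target in-well dose.
# The helper name is hypothetical; the volumes are those of the test method.

def stock_ppm_for_target(target_ppm, enzyme_ul=10.0, detergent_ul=190.0):
    """Stock concentration such that enzyme_ul of stock in the well
    gives target_ppm after adding detergent_ul of detergent solution."""
    dilution_factor = (enzyme_ul + detergent_ul) / enzyme_ul  # 20 for 10 + 190
    return target_ppm * dilution_factor


low_stock_ppm = stock_ppm_for_target(0.5)   # stock for a 0.5 ppm dose
high_stock_ppm = stock_ppm_for_target(4.0)  # stock for a 4 ppm dose
```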



Table 1-1. Detergent Composition and Incubation Conditions in the µSwatch Assay

Geography                  Reference   Detergent                      Water Hardness        Enzyme Dosage   Temp.   Swatch
                           Enzyme                                                           [ppm]
European                   ASP, GG36   7.6 g/l ARIEL® Regular         15 gpg - Ca/Mg:4/1    0.5-4           40°     Superfix
Japanese                   ASP, GG36   0.66 g/l Detergent Comp. III   3 gpg - Ca/Mg:3/1     0.5-4           20°     3K
Cold Water Liquid          ASP         1.6 g/l Tide® LVJ-1            6 gpg - Ca/Mg:3/1     0.5-4           20°     3K
Liquid Detergent Comp. I   ASP         1.6 g/l Detergent Comp. I      6 gpg - Ca/Mg:3/1     0.5-4           30°     3K



** The stock solution was used at a concentration of 15,000 gpg
stock #1 = Ca/Mg 3:1
(1.92 M Ca2+ = 282.3 g/L CaCl2·2H2O; 0.64 M Mg2+ = 130.1 g/L MgCl2·6H2O)
stock #2 = Ca/Mg 4:1
(2.05 M Ca2+ = 301.4 g/L CaCl2·2H2O; 0.51 M Mg2+ = 103.7 g/L MgCl2·6H2O)
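
The stock recipes can be cross-checked from the molarities. The sketch below assumes molecular weights of about 147.01 g/mol for CaCl2·2H2O, 203.30 g/mol for MgCl2·6H2O and 100.09 g/mol for CaCO3, and takes 1 gpg as 17.118 mg/L expressed as CaCO3; none of these constants is given in the text. On that basis 0.64 M Mg2+ corresponds to about 130 g/L MgCl2·6H2O, and a total of 2.56 M divalent ions corresponds to roughly 15,000 gpg.

```python
# Sketch: cross-checking the hardness stock recipes.
# Assumed constants (not from the text):
CACL2_2H2O_MW = 147.01    # g/mol
MGCL2_6H2O_MW = 203.30    # g/mol
CACO3_MW = 100.09         # g/mol
MG_CACO3_PER_L_PER_GPG = 17.118  # mg/L as CaCO3 per grain per US gallon


def grams_per_litre(molar, mw):
    """Mass of salt per litre needed for the given molarity."""
    return molar * mw


def hardness_gpg(total_divalent_molar):
    """Hardness in gpg, expressing total Ca + Mg as CaCO3."""
    mg_per_l_as_caco3 = total_divalent_molar * CACO3_MW * 1000.0
    return mg_per_l_as_caco3 / MG_CACO3_PER_L_PER_GPG


stock1_ca = grams_per_litre(1.92, CACL2_2H2O_MW)
stock1_mg = grams_per_litre(0.64, MGCL2_6H2O_MW)
stock2_ca = grams_per_litre(2.05, CACL2_2H2O_MW)
stock2_mg = grams_per_litre(0.51, MGCL2_6H2O_MW)
stock1_gpg = hardness_gpg(1.92 + 0.64)
stock2_gpg = hardness_gpg(2.05 + 0.51)
```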

Calculation of the BMI Performance: 

The obtained absorbance value was corrected for the blank value (obtained after
incubation of microswatches in the absence of enzyme). The resulting absorbance was a
measure for the hydrolytic activity. For each sample (variant) the performance index was
calculated. The performance index compares the performance of the variant (actual value)
and the standard enzyme (theoretical value) at the same protein concentration. In addition,
the theoretical values can be calculated, using the parameters of the Langmuir equation of
the standard enzyme. A performance index (PI) that is greater than 1 (PI>1) identifies a
better variant (as compared to the standard [e.g., wild-type]), while a PI of 1 (PI=1) identifies
a variant that performs the same as the standard, and a PI that is less than 1 (PI<1)
identifies a variant that performs worse than the standard.

Thus, the PI identifies winners, as well as variants that are less desirable for use under
certain circumstances.



Dimethylcasein Hydrolysis Assay (96 wells)

In this assay system, the chemical and reagent solutions used were:

Dimethylcasein (DMC): Sigma C-9801

TWEEN®-80: Sigma P-8074

PIPES buffer (free acid): Sigma P-1851; 15.1 g is dissolved in about 960 ml water; pH is
adjusted to 7.0 with 4N NaOH, 1 ml 5% TWEEN®-80 is
added and the volume brought up to 1000 ml. The final
concentration of PIPES and TWEEN®-80 is 50 mM and
0.005%, respectively.

Picrylsulfonic acid (TNBS): Sigma P-2297 (5% solution in water)

Reagent A: 45.4 g Na2B4O7·10H2O (Merck 6308) and 15 ml of 4N NaOH
are dissolved together to a final volume of 1000 ml (by
heating if needed)

Reagent B: 35.2 g NaH2PO4·1H2O (Merck 6346) and 0.6 g Na2SO3 (Merck
6657) are dissolved together to a final volume of 1000 ml.



Method: 

To prepare the substrate, 4 g DMC were dissolved in 400 ml PIPES buffer. The filtered
culture supernatants were diluted with PIPES buffer; the final concentration of the controls in
the growth plate was 20 ppm. Then, 10 µl of each diluted supernatant were added to 200 µl
substrate in the wells of a MTP. The MTP plate was covered with tape, shaken for a few
seconds and placed in an oven at 37°C for 2 hours without agitation.

About 15 minutes before removal of the 1st plate from the oven, the TNBS reagent was
prepared by mixing 1 ml TNBS solution per 50 ml of reagent A. MTPs were filled with 60 µl
TNBS reagent A per well. The incubated plates were shaken for a few seconds, after which
10 µl were transferred to the MTPs with TNBS reagent A. The plates were covered with
tape and shaken for 20 minutes in a bench shaker (BMG Thermostar) at room temperature
and 500 rpm. Finally, 200 µl reagent B were added to the wells, mixed for 1 minute on a
shaker, and the absorbance at 405 nm was determined using an MTP-reader.

Calculation of Dimethylcasein Hydrolyzing Activity: 

The obtained absorbance value was corrected for the blank value (substrate without
enzyme). The resulting absorbance is a measure of the hydrolytic activity. The (arbitrary)
specific activity of a sample was calculated by dividing the corrected absorbance by the
determined protein concentration.
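
The calculation above can be sketched as follows. This is a minimal illustration, not the
procedure actually used; the function name, the argument names and the example readings are
assumptions introduced here.

```python
def specific_activity(a_sample, a_blank, protein_ppm):
    """Blank-corrected absorbance divided by protein concentration (arbitrary units)."""
    if protein_ppm <= 0:
        raise ValueError("protein concentration must be positive")
    return (a_sample - a_blank) / protein_ppm


# Hypothetical 405 nm readings; the numbers are illustrative only.
print(specific_activity(a_sample=0.85, a_blank=0.10, protein_ppm=20.0))
```

Because the result is in arbitrary units, it is only meaningful when samples are compared
under the same assay conditions.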

E. Thermostability Assay 

This assay is based on the dimethylcasein hydrolysis, before and after heating of the 
buffered culture supernatant. The same chemical and reagent solutions were used as 
described in the dimethylcasein hydrolysis assay. 

Method: 

The filtered culture supernatants were diluted to 20 ppm in PIPES buffer (based on
the concentration of the controls in the growth plates). Then, 50 µl of each diluted
supernatant were placed in the empty wells of a MTP. The MTP plate was incubated in an
iEMS incubator/shaker HT (Thermo Labsystems) for 90 minutes at 60°C and 400 rpm. The
plates were cooled on ice for 5 minutes. Then, 10 µl of the solution was added to a fresh
MTP containing 200 µl dimethylcasein substrate/well. This MTP was covered with tape,
shaken for a few seconds and placed in an oven at 37°C for 2 hours without agitation. The
same detection method as used for the DMC hydrolysis assay was used.

Calculation of Thermostability: 

The residual activity of a sample was expressed as the ratio of the final absorbance
(after heating) to the initial absorbance (before heating), both corrected for blanks.
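
As a hedged sketch of this ratio: the helper below assumes that one blank reading is
available for each measurement; the function name and the example values are hypothetical.

```python
def residual_activity(a_final, a_initial, blank_final, blank_initial):
    """Ratio of blank-corrected absorbance after heating to that before heating."""
    initial = a_initial - blank_initial
    if initial <= 0:
        raise ValueError("initial blank-corrected absorbance must be positive")
    return (a_final - blank_final) / initial


# Hypothetical readings: 0.50 after the 60°C incubation, 0.90 without heating.
print(residual_activity(a_final=0.50, a_initial=0.90, blank_final=0.10, blank_initial=0.10))
```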

F. LAS Stability Assay 

LAS stability was measured after incubation of the test protease in the presence of 
0.06% LAS (dodecylbenzenesulfonate sodium), and the residual activity was determined 
using the AAPF assay.

Reagents:

Dodecylbenzenesulfonate, sodium salt (=LAS): Sigma D-2525
TWEEN®-80: Sigma P-8074
TRIS buffer (free acid): Sigma T-1378; 6.35 g is dissolved in about 960 ml water; pH is
adjusted to 8.2 with 4N HCl. Final concentration of TRIS is 52.5 mM.
LAS stock solution: Prepare a 10.5% LAS solution in MQ water (=10.5 g per 100 ml MQ)
TRIS buffer, 100 mM / pH 8.6 (100 mM Tris/0.005% Tween80)
TRIS-Ca buffer, pH 8.6 (100 mM Tris/10 mM CaCl2/0.005% Tween80)
Hardware: 

Flat bottom MTPs: Costar (#9017) 
Biomek FX 
ASYS Multipipettor 
Spectramax MTP Reader 
iEMS Incubator/Shaker 
Innova 4330 Incubator/Shaker 
Biohit multichannel pipette 
BMG Thermostar Shaker 



Method: 

A 10 µl 0.063% LAS solution was prepared in 52.5 mM Tris buffer pH 8.2. The
AAPF working solution was prepared by adding 1 ml of 100 mg/ml AAPF stock solution (in
DMSO) to 100 ml (100 mM) TRIS buffer, pH 8.6. To dilute the supernatants, flat-bottomed
plates were filled with dilution buffer and an aliquot of the supernatant was added and
mixed well. The dilution ratio depended on the concentration of the ASP-controls in the
growth plates (AAPF activity). The desired protein concentration was 80 ppm.

Ten µl of the diluted supernatant was added to 190 µl 0.063% LAS buffer/well. The
MTP was covered with tape, shaken for a few seconds and placed in an incubator (Innova
4230) at 25°C, for 60 minutes at 200 rpm agitation. The initial activity (t=10 minutes) was
determined after 10 minutes of incubation by transferring 10 µl of the mixture in each well to
a fresh MTP containing 190 µl AAPF work solution. These solutions were mixed well and the
AAPF activity was measured using a MTP Reader (20 readings in 5 minutes and 25°C).

The final activity (t=60 minutes) was determined by removing another 10 µl of
solution from the incubating plate after 60 minutes of incubation. The AAPF activity was
then determined as described above. The calculations were performed as follows:
the % Residual Activity was [t-60 value]*100 / [t-10 value].
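
The two pieces of arithmetic in this method can be sketched as below. The sketch shows how
adding 10 µl of sample to 190 µl of 0.063% LAS buffer gives approximately the 0.06% LAS
stated at the start of this section, and how the percent residual activity is formed. The
function names and the example rates are assumptions made for illustration.

```python
def diluted_concentration(stock_pct, stock_ul, sample_ul):
    """Concentration of a buffer component after a sample volume is added."""
    return stock_pct * stock_ul / (stock_ul + sample_ul)


def percent_residual_activity(rate_t10, rate_t60):
    """AAPF activity at t=60 min as a percentage of the activity at t=10 min."""
    if rate_t10 <= 0:
        raise ValueError("t=10 activity must be positive")
    return 100.0 * rate_t60 / rate_t10


# 190 µl of 0.063% LAS plus 10 µl of supernatant.
print(round(diluted_concentration(0.063, 190.0, 10.0), 5))

# Hypothetical AAPF rates (e.g., mOD/min) at the two time points.
print(percent_residual_activity(rate_t10=40.0, rate_t60=30.0))
```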



G. Scrambled Egg Hydrolysis Assay 

Proteases release insoluble particles from scrambled egg, which was baked into the
wells of 96-well microtiter plates. The scrambled egg coated wells were treated with a
mixture of protease containing culture filtrate and ADW (automatic dishwash detergent) to
determine the enzyme performance in scrambled egg removal. The rate of turbidity is a
measure of the enzyme activity.


Materials:
Water bath
Oven with mechanical air circulation (Memmert ULE 400)
Incubator/shaker with amplitude of 0.25 cm (Multitron), equipped with MTP-holders and
aluminum covers and bottoms
Biomek FX liquid-handling system (Beckman)
Micro plate reader (Molecular Devices Spectramax 340, SOFTmax Pro Software)
Nichiryo 8800 multi channel syringe dispenser + syringes
Micro titer plate tape
Single and multi channel pipettes with tips
Grade A medium eggs
CaCl2.2H2O (Merck 102382); MgCl2.6H2O (Merck 105833); Na2CO3 (Merck 6392)
ADW product: LH-powder (= Light House)



Procedure: 

Three eggs were stirred with a fork in a glass beaker and 100 ml milk (at 4°C or
room temperature) was added. The beaker was placed in an 85°C water bath, and the
mixture was stirred constantly with a spoon. As the mixture became thicker, care was taken
to scrape the solidifying material continuously from the walls and bottom of the beaker.
When the mixture was slightly runny (after about 25 minutes) the beaker was removed from
the bath. Another 40 ml milk was added to the mixture and blended with a hand mixer or
blender for 2 minutes. The mixture was cooled to room temperature (an ice bath can be
used). The substrate was then stirred with an additional amount of 5 to 15% water (usually
7.5%).

Test Method:

First, 50 µl of scrambled egg substrate were dispensed into each well of a MTP. The
plates were allowed to dry at room temperature overnight (about 17 hours), baked in an oven at
80°C for 2 hours, then cooled to room temperature.

ADW product solution was prepared by dissolving 2.85 g of LH-powder into 1 L
water. Only about 15 minutes dissolution time was needed and filtration of the solution was
not needed. Then, 1.16 mL artificial hardness solution was added and 2120 mg Na2CO3
was dissolved in the solution.

Hardness solution was prepared by mixing 188.57 g CaCl2.2H2O and 86.92 g
MgCl2.6H2O in 1 L demi water (equal to 1.28 M Ca + 0.43 M Mg and totally 10000 gpg). The
above-mentioned amounts of ADW, CaCl2 and MgCl2 were already proportionally increased
values (200/190x) because of the addition of 10 µl supernatant to 190 µl ADW solution.
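
The stated molarities and the 200/190 correction can be checked with a short calculation.
This is only a sketch: the molar masses are standard values supplied here rather than taken
from the text, and the function names are hypothetical.

```python
# Standard molar masses (g/mol); not stated in the text.
MW_CACL2_2H2O = 147.01
MW_MGCL2_6H2O = 203.30


def molarity(grams_per_litre, molar_mass):
    """Molar concentration of a salt dissolved to one litre."""
    return grams_per_litre / molar_mass


def in_well(stock_value, stock_ul=190.0, sample_ul=10.0):
    """Concentration in the well after the supernatant dilutes the ADW solution."""
    return stock_value * stock_ul / (stock_ul + sample_ul)


print(round(molarity(188.57, MW_CACL2_2H2O), 2))  # Ca in the hardness solution, mol/L
print(round(molarity(86.92, MW_MGCL2_6H2O), 2))   # Mg in the hardness solution, mol/L
print(in_well(2.85))                              # g/L LH-powder in the well
```

Preparing the stock at 200/190 times the target means that the 10 µl of supernatant brings
every component back down to its intended in-well concentration.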

ADW solution (190 µl) was added to each well of the substrate plate. The MTPs
were processed by adding 10 µl of supernatant to each well and sealing the plate with tape.
The plate was placed in a pre-warmed incubator/shaker and secured with a metal cover and
clamp. The plate was then washed for 30 minutes at the appropriate temperature (50°C for
US) at 700 rpm. The plate was removed from the incubator/shaker. With gentle up and
down movements of the liquid, about 125 µl of the warm supernatant were transferred to an
empty flat bottom plate. After cooling, exactly 100 µl of the dispersion was dispensed into
the wells of an empty flat bottom plate. The absorbance at 405 nm was determined using a
microtiter plate reader.

Calculation of the Scrambled Egg Hydrolyzing Activity:

The obtained absorbance value was corrected for the blank value (substrate without
enzyme). The resulting absorbance is a measure of the hydrolytic activity. For each
sample (variant) the performance index was calculated. The performance index compares
the performance of the variant (actual value) and the standard enzyme (theoretical value) at
the same protein concentration. In addition, the theoretical values can be calculated, using
the parameters of the Langmuir equation of the standard enzyme. A performance index (PI)
that is greater than 1 (PI>1) identifies a better variant (as compared to the standard [e.g.,
wild-type]), while a PI of 1 (PI=1) identifies a variant that performs the same as the standard,
and a PI that is less than 1 (PI<1) identifies a variant that performs worse than the standard.
Thus, the PI identifies winners, as well as variants that are less desirable for use under
certain circumstances.
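
A minimal sketch of the performance index follows. It assumes that the Langmuir equation for
the standard enzyme has the saturation form A = a_max * c / (k_half + c). The text does not
give the parameterization, so this form, the parameter names and the example numbers are all
assumptions.

```python
def langmuir(conc, a_max, k_half):
    """Assumed Langmuir-type dose response of the standard enzyme."""
    return a_max * conc / (k_half + conc)


def performance_index(variant_value, variant_conc, a_max, k_half):
    """Variant result divided by the standard's theoretical result at the same concentration."""
    theoretical = langmuir(variant_conc, a_max, k_half)
    if theoretical <= 0:
        raise ValueError("theoretical value must be positive")
    return variant_value / theoretical


# Hypothetical fit for the standard (a_max=1.0, k_half=2.0 ppm) and one variant reading.
pi = performance_index(variant_value=0.6, variant_conc=2.0, a_max=1.0, k_half=2.0)
print("better than standard" if pi > 1 else "not better than standard")
```

Using a fitted curve for the standard allows each variant to be compared at its own measured
protein concentration instead of requiring that all samples be normalized first.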



EXAMPLE 2 

Production of 69B4 protease From the Gram-Positive Alkaliphilic Bacterium 69B4 

This Example provides a description of the Cellulomonas strain 69B4 used to initially
isolate the novel protease 69B4 provided by the present invention. The alkaliphilic micro-
organism Cellulomonas strain 69B.4 (DSM 16035) was isolated at 37°C on an alkaline
casein medium containing (g L-1) (See e.g., Duckworth et al., FEMS Microbiol. Ecol., 19:181-
191 [1996]).



Glucose (Merck 1.08342) 10
Peptone (Difco 0118) 5
Yeast extract (Difco 0127) 5
K2HPO4 1
MgSO4.7H2O 0.2
NaCl 40
Na2CO3 10
Casein 20
Agar 20



An additional alkaline cultivation medium (Grant Alkaliphile Medium) was also used
to cultivate Cellulomonas strain 69B.4, as provided below:

Grant Alkaliphile Medium ("GAM") solution A (g L-1)
Glucose (Merck 1.08342) 10
Peptone (Difco 0118) 5
Yeast extract (Difco 0127) 5
K2HPO4 1
MgSO4.7H2O 0.2

Dissolved in 800 ml distilled water and sterilized by autoclaving.

GAM solution B (g L-1)
NaCl 40
Na2CO3 10

Dissolved in 200 ml distilled water and sterilized by autoclaving.

Complete GAM medium was prepared by mixing Solution A (800 ml) with Solution B
(200 ml). Solid medium is prepared by the addition of agar (2% w/v).

Growth Conditions 

From a freshly thawed glycerol vial of culture (stored as a frozen glycerol (20% v/v)
stock at -80°C), the micro-organisms were inoculated using an inoculation loop on
Grant Alkaliphile Medium (GAM) described above in agar plates and grown for at least 2
days at 37°C. One colony was then used to inoculate a 500 ml shake flask containing 100
ml of GAM at pH 10. This flask was then incubated at 37°C in a rotary shaker at 280 rpm for
1-2 days until good growth (according to visual observation) was obtained. Then, 100 ml of
broth culture was subsequently used to inoculate a 7 L fermentor containing 5 liters of GAM.
The fermentations were run at 37°C for 2-3 days in order to obtain maximal production of
protease. Fully aerobic conditions were maintained throughout by injecting air, at a rate of 5
L/min, into the region of the impeller, which was rotating at about 500 rpm. The pH was set
at pH 10 at the start, but was not controlled during the fermentation.

Preparation of 69B4 Crude Enzyme Samples 

Culture broth was collected from the fermentor, and cells were removed by
centrifugation for 30 min at 5000 x g at 10°C. The resulting supernatant was clarified by
depth filtration over Seitz EKS (SeitzSchenk Filtersystems). The resulting sterile culture
supernatant was further concentrated approximately 10 times by ultrafiltration using an
ultrafiltration cassette with a 10 kDa cut-off (Pall Omega 10kDa Minisette; Pall). The resulting
concentrated crude 69B4 samples were frozen and stored at -20°C until further use.

Purification

The cell separated culture broth was dialyzed against 20 mM 2-(4-morpholino)-
ethane sulfonic acid ("MES"), pH 5.4, 1 mM CaCl2 using 8K Molecular Weight Cut Off
(MWCO) Spectra-Por7 (Spectrum) dialysis tubing. The dialysis was performed overnight or
until the conductivity of the sample was less than or equal to the conductivity of the MES
buffer. The dialyzed enzyme sample was purified using a BioCad VISION (Applied
Biosystems) with a 10x100 mm (7.845 mL) POROS High Density Sulfo-propyl (HS) 20
(20 micron) cation-exchange column (PerSeptive Biosystems). After loading the enzyme on
the previously equilibrated column at 5 mL/min, the column was washed at 40 mL/min with a
pH gradient from 25 mM MES, pH 6.2, 1 mM CaCl2 to 25 mM (N-[2-hydroxyethyl] piperazine-
N'-[2-ethane] sulfonic acid [C8H18N2O4S, CAS # 7365-45-9]) ("HEPES") pH 8.0, 1 mM CaCl2
in 25 column volumes. Fractions (8 mL) were collected across the run. The pH 8.0 wash
step was held for 5 column volumes and then the enzyme was eluted using a gradient (0-
100 mM NaCl in the same buffer in 35 column volumes). Protease activity in the fractions
was monitored using the pNA assay (sAAPF-pNA assay; DelMar et al., supra). Protease
activity which eluted at 40 mM NaCl was concentrated and buffer exchanged (using a 5K
MWCO VIVA Science 20 mL concentrator) into 20 mM MES, pH 5.8, 1 mM CaCl2. This
material was used for further characterization of the enzyme.

EXAMPLE 3

PCR Amplification of a Serine Protease Gene Fragment 

In this Example, PCR amplification of a serine protease gene fragment is described.

Degenerate Primer Design

Based on alignments of published serine protease amino acid sequences, a range of
degenerate primers were designed against conserved structural and catalytic regions. Such
regions included those that were highly conserved among the serine proteases, as well as
those known to be important for enzyme structure and function.

During the development of the present invention, protein sequences of nine
published serine proteases (Streptogrisin C homologues) were aligned, as shown below.
The sequences were Streptomyces griseus Streptogrisin C (accession no. P52320); alkaline
serine protease precursor from Thermobifida fusca (accession no. AAC23545); alkaline
proteinase (EC 3.4.21.-) from Streptomyces sp. (accession no. PC2053); alkaline serine
proteinase I from Streptomyces sp. (accession no. S34672); serine protease from
Streptomyces lividans (accession no. CAD42808); putative serine protease from
Streptomyces coelicolor A3(2) (accession no. NP_625129); putative serine protease from
Streptomyces avermitilis MA-4680 (accession no. NP_822175); serine protease from
Streptomyces lividans (accession no. CAD42809); putative serine protease precursor from
Streptomyces coelicolor A3(2) (accession no. NP_628830). All of these sequences are
publicly available from GenBank. These alignments are provided below. In this alignment,
two conserved boxes are underlined and shown in bold.



AAC23545 (1) . — MNHSSR— RTTSLLFTAALAATALVAATTPAS

PC2053 (1) — MRHTGR-N^IGAAIAASALAPALVPSQAAAN DTLTERAEAAV 

S34672 (1) - -MRIiKGRTVAIGSALAASALALSLVPANASSELP SAETAKADALV 

CAD42808 (1) MVGRHAAR- SRRAALTALGALVLTALPSAASAAPPPVPGPRPAVARTPDA 

NP_625129 (1) MVGRHAAR- SRRAALTALGALVLTALPSAASAAPPPVPGPRPAVARTPDA 

NP_822175 (1) MVHRHVG — AGCAGLSVLATLVLTGLPAAAAIBPP-GPAPAPSAVQPLGA

CAD42809 (1) MPHRHRHH - RAVGAAVAATAALLVAGLSGSAS AGTAPAG SAPTAAETLRT 

NP_628830 (1) M PHRHRHH - RAVGAAVAATAALLVAGL SG S AS AGTAPAG SAPTAAETLRT 

P52320 (1) ---MERTT-LRRRALVAGTATVAVGALA^ 

51 100

AAC23545 (31) AQELALKRDI/^SDAEV7VELRAAEAEAVELEEELRDSIiGSDFGGV 

PC2053 (42) ADLPAGVLDAMERDLGLSEQEAGLKLVAEHDAALI/SETLSADIjDAFAGSW 

S34672 (45) EQLPAGMVDAMER0LGVPAAEVGNQLVAEHEAAVLEESLSEDLSGYAGSW 

CAD42808 (50) ATAPARMLSAMERDLRLAPGQAAARPVNEAEAGTRAGMIJUn'LGDRFAGA 

NP_625129 (50) ATAPARMLSAMERDLRIAPGQAAARLVNEAEAGTRAGMIJOTIX5DRFAGA

NP_822175 (48) GOTSTAVLGAIiQRDLHLTDTQAKTRLVNEMEAGT^ 

CAD42809 (50) DAAP PALLKAMQRDLG I DRRQ AERRLVNEAEAGATAGRLRAALGGDFAGA 

NP_628830 (50) DAAPPALLKAMQRDLGIJ)RRQAERRLVNEAEAGATAGRLRAALGGDFAGA 

P52320 (47) DSLS PGMLAALERDI/SIjDEDAARSRIANEYRAAAV7^GIjEKSLGARYAGA 






101 150 
AAC23545 (76) YLD ADT - TE I TVAVTDPAA VSR VDADDVTVDVVDFGETALNDFVASL^ 
PC2053 (92) LAEGT EL\A/ATTS EAEAAEITRAGATAEVVDHTLAELDS VKDALDTA

S34672 (95) I VEGTS — EHWATTDRAEAAE I TAAGATATWEHSLAELEAVKDI LDEA 

CAD42808 (100) WVSGATS AELTVATTDAADTAAI EAQGAKAAVVGRNLAELRAVKEKLDAA 

NP_625129 (100) WVSG ATSAELTVATTDAADT AAI EAQGAKAAVVGRNLAELRAVKEKLDAA 

NP_822175 (98) WVHGAASADLTVATTHATDI PAITAGGATAVWKTGLDDLKGAKKKLDSA 

CAD42809 (100) WVRGAE SGT LTVATT D AGD VAA VEARGAE AKWRH S LADLDAAKARLDT A 

NP_628830 (100) WVRG AESGTLTVATTD AGDVAAI EARGAEAKWRH SLADLDAAKARLDTA 

P52320 (97) RVSGAK-ATLWATTDASEAARITEAGARAEVVGHSLDRFEGVKKSLDKA 

151 200 

AAC23545 (125) ADT — ADPKVTGWYTDLESDAWI TTLRGGT PAAEELAERAGLDERAVRI 

PC2053 (139) AES - YDTTDAPVWYVDVTTNGVVLLTSD — VTEAEGFVEAAGVNAAAVDI 

S34672 (143) ATA-NPEDAAPVWYVDVTTNEWVLASD — VPAAEAFVAASGADASTVRV 

CAD42808 (150) AVR - TRTRQT P VWYVDVKTNRVTVQ ATG — A S AAAAFVEAAGVP AADVGV 

NP_625129 (150) A VR~ TRTRQT PWYVDVKTNRVTVQATG-- AS AAAAFVEAAGVP AADVGV

NP_822175 (148) VAHGGTAVNT P VR YVDVRTNR VTLQ ARS - -RAAADALIAAAGVDSGLVDV 

CAD42809 (150) AAG -LNTADAPVWYVDTRTNTWVEAIR- - PAAARSLLTAAGVDGSLAHV 

NP_628830 (150) AAG-LNTADAPVWYVDTRTNTVWEAIR — PAAARSLLTAAGVDGSLAHV 

P52320 (146) ALD-KAPKNVPVWYVDVAANRVVVNAAS — PAAGQAFLKVAGVDRGLVTV 

201 250 

AAC23545 (173) VEEDEEPQS LAAI I GGNPYYFGN- YRCS I GFSVRQGSQTGPATAGHCGST 

PC2053 (186) QTSDEQPQAF YDLVGGDAYYMGG -GRC SVGF S VTQGSTPGFATAGHCGTV 

S34672 (190) ERSDESPQPFYDLVGGDAYYIGN-GRCSIGFSVRQGSTPGFVTAGHCGSV 

CAD42808 (197) RVSPDQPRVLEDLVGGDAYYIDDQARCSIGFSVTKDDQEGFATAGHCGDP 

NP_625129 (197) RVS PDQPRVLEDLVGGDAYYI DDQARC S IGFS VTKDDQEGFATAGHCGDP 

NP_822175 (196) KVS EDRPRAL FDI RGGD A YY I DNTARC SVGFSVTKGNQQGFATAGH CGRA 

CAD42809 (197) KNRTERPRTF YDLRGGEA YYINNS SRCS IGFP I TKGTQQGFATAGHCDRA 

NP_628830 (197) KNRTERPRTF YDLRGGEA YYINNS SRCS IGFP I TKGTQQGF AT AGH(X3RA 

P52320 (193) AR S AEQ PRALADI RGGDA YYMNG S GRC SVG FSVTRGTQNG F ATAGH CGRV 

251 300

AAC23545 (222) GTRVS S P SGTVAG SYF PGRDMGWVRI TS ADTVT PL VNR YNGGTVTV 

PC2053 (235) GTSTTGYNQAAQGTFEESSFPGDDMAWSVNSDWNTTPTVNE---GE-VTV 

S34672 (239) GNATTGFNRVSQGTFRGSWFPGRDMAWVAVNSNWTPTSLVRNS -GSGVRV

CAD42808 (247) GATTTGYNEADQGTFQASTFPGK1M1AVA/GVNSDWTATPDVKAEGG 

NP_625129 (247) GATTTGYNEADQGTFQASTFPGKDMAWVGVNSDWTATPDVKAEGGEKIQL 

NP_822175 (246) GAPTAGFTOVAQGTVQASWPGHDMAWVGVNSDWTATPDVAGAAGQNVSI 

CAD42809 (247) G S STTGANRVAQGTFQG S I FPGRDMAWVATN S SWTAT P YVLGAGGQNVQV 

NP_628830 (247) GS STTGANRVAQGTFQGSI FPGRM^VA/ATNSSVWATP YVLGAGGQNVQV

P52320 (243) GTTTNGVNQQAQGTFQGSTFPGRDIAWVATNANWTPRPLWGYGRGDVTV

301 350 

AAC23545 (268) TG SQEAATGS S VCRSGATTGWRCGTI QSKNQTVRYAEGTVTGLTRTTACA 

PC2053 (282) SGSTEAAVGASICRSGSTTGraCGTIQQHNTSWYPEGTITG^miTSVCA 

S34672 (288) TG STQATVGS S I CRSG STT6WRCGTI QQHNTSVTY PQGTI TGVTRTS ACA 

CAD42808 (297) AGSVEALVGASVCRSGSTTGWHCGTI QQHDTSVTYPEGTVDGLTGTTVCA 

NP_625129 (297) AG SVEALVGASVCR5GSTTGWHC6TI QQHDTSVTYPEGTVDGLTETTVCA 

NP_822175 (296) AG SVQAI VGAAI CRSG STTGiraCGTVEEHDTSVTYEEGTVDGLTRTTVCA 

CAD42809 (297) TGSTASPVGASVCRSGSTTGWHCGTVTQLNTSWYQEGTI SPVTRTTVCA 

NP_628830 (297) TGSTAS PVGASVCRSGSTTGWHCGTVTQLOTSVTYQEGTISPVTRTTVCA 

P52320 (293) AGSTASWGASVCRSGS TTGVmCGT IQQLNTSVTYPEGTISGVTRTSVCA 

351 400 

AAC23545 (318) EGGDSGGPWLTGSQAQGVTSGGTGDCRSGGITFFQPINPLLSYFGLQLVT 

PC2053 (332) EPGDSGGSYISGSQAQGVTSGGSGNCTSGGTTYHQPINPLLSAYGLDLVT 

S34672 (338) QPGDSGGSFISGTQAQGVTSGGSGNCSIGGTTFHQPVNPILSQYGLTLVR 

CAD42808 (347) EPGDSGGPFVSGVQAQGTTSGGSGDCTNGGTTFYQPVNPLLSDFGLTLKT 

NP_625129 (347) EPGDSGGPFVSGVQAQGTTSGGSGDCTNGGTTFYQPVNPLLSDF<3LTLKT 

NP_822175 (346) EPGDSGGSFVSGSQAQGVTSGGSGDCTRGGTTYYQPVNPILSTYGLTLKT 

CAD42809 (347) EPGDSGGSFI SGSQAQGVTSGGSGDC^TGGGTFT^PINALIXmGLTLKT 

NP_628830 (347) EPGDSGGSFISGSQAQGVTSGGSGDCRTGGETFTQPINALLQNYGLTLKT 

P52320 (343) E PGDSGG SYISGSQAQGVTSGGSGNCSSGGTTYFOPINPLLQAYGLTLVT 

401 450 

AAC23545 (368) G - 

PC2053 (382) G — 

S34672 (388) S 

CAD42808 (397) TSAATQTPAPQDNAAA DAWTAGRVYEVGTTVSYDGVRYRCLQSH 

NP_625129 (397) TSAATQTPAPQDNAAA DAWTAGRVYEVGTTVSYDGVRYRCLQSH 

NP_822175 (396) STAPTDTPSDPVDQSG VWAAGRVYEVGAQVTYAGVTYQCLQSH 

CAD42809 (397) TGGDDGGGDDGG EEPGG-TWAAGTVYQPGDTVTYGGATFRCLQGH 

NP_628830 (397) TGGDDGGGDDGGGDDGGEE PGG -TWAAGTVY QPGDTVTYGGATFRCLQGH 

P52320 (393) SGGGTPTDPPTTPPTDSP- - -GGTWAVGTAYAAGATVTYGGATYRCLQAH 

451 468

AAC23545 (369) - (SEQ ID NO:648)

PC2053 (383) . (SEQ ID NO:649)

S34672 (389) (SEQ ID NO:650)

CAD42808 (441) QAQGVGSPASVPALWQRV (SEQ ID NO:651)

NP_625129 (441) QAQGVGSPASVPALWQRV (SEQ ID NO:652)

NP_822175 (439) QAQGVWQPAATPALWQRL (SEQ ID NO:653)

CAD42809 (441) QAYAGWEPPNVPALWQRV (SEQ ID NO:654)

NP_628830 (446) QAYAGWEPPNVPALWQRV (SEQ ID NO:655)

P52320 (440) TAQPGWTPADVPALWQRV (SEQ ID NO:656)



Two particular regions were chosen to meet the criteria above, and a forward and a
reverse primer were designed based on these amino acid regions. The specific amino acid
regions used to design the primers are highlighted in black in the sequences shown in the
alignments directly above. Using the genetic code for codon usage, degenerate nucleotide
PCR primers were synthesized by MWG-Biotech. The degenerate primer sequences
produced were:

forward primer TTGWXCGT_FW: 5' ACNACSGGSTGGCRGTGCGGCAC 3' (SEQ ID NO:10)

reverse primer GDSGGX_RV: 5'-ANGNGCCGCCGGAGTCNCC-3' (SEQ ID NO:11)

All primers were synthesized in the 5'-3' direction, and the standard IUB code for
mixed base sites was used (e.g., "N" designates A/C/T/G). Degenerate primers
TTGWXCGT_FW and GDSGGX_RV successfully amplified a 177 bp region from
Cellulomonas sp. isolate 69B4 by PCR, as described below.
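
The mixed-base notation can be illustrated with a short sketch that counts how many distinct
oligonucleotides each degenerate primer represents and writes the reverse primer as its
sense-strand equivalent. The helper names are assumptions; the code table is the standard
IUB/IUPAC nucleotide code.

```python
from itertools import product

# Standard IUB/IUPAC nucleotide codes and their complements.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W", "K": "M", "M": "K",
    "B": "V", "V": "B", "D": "H", "H": "D", "N": "N",
}

FORWARD = "ACNACSGGSTGGCRGTGCGGCAC"  # SEQ ID NO:10
REVERSE = "ANGNGCCGCCGGAGTCNCC"      # SEQ ID NO:11


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every concrete sequence represented by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


def reverse_complement(primer):
    """Reverse complement that preserves ambiguity codes."""
    return "".join(COMPLEMENT[b] for b in reversed(primer))


print(degeneracy(FORWARD), degeneracy(REVERSE))
print(reverse_complement(REVERSE))
```

Read in codons, the reverse complement of the reverse primer is GGN GAC TCC GGC GGC NCN,
which corresponds to the Gly-Asp-Ser-Gly-Gly-X motif that gives the primer its name.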

PCR Amplification of a Serine Protease Gene Fragment

Cellulomonas sp. isolate 69B4 genomic DNA was used as a template for PCR
amplification of putative serine protease gene fragments using the above-described primers.
PCR was carried out using High Fidelity Platinum Taq polymerase (Catalog number 11304-
102; Invitrogen). Conditions were determined by individual experiments, but typically thirty
cycles were run in a thermal cycler (MJ Research). Successful amplification was verified by
electrophoresis of the PCR reaction on a 1% agarose TBE gel. A PCR product that was
amplified from Cellulomonas sp. 69B4 with the primers TTGWXCGT_FW and
GDSGGX_RV was purified by gel extraction using the Qiaquick Spin Gel Extraction kit
(Catalogue 28704; Qiagen) according to the manufacturer's instructions. The purified PCR
product was cloned into the commercially available pCR2.1TOPO vector System
(Invitrogen) according to the manufacturer's instructions, and transformed into competent
E. coli TOP10 cells. Colonies containing recombinant plasmids were visualized using
blue/white selection. For rapid screening of recombinant transformants, plasmid DNA was
prepared from cultures of putative positive (i.e., white) colonies. DNA was isolated using
the Qiagen plasmid purification kit, and was sequenced by Baseclear. One of the clones
contained a DNA insert of 177 bp that showed some homology with several streptogrisin-like
protease genes of various Streptomyces species and also with serine protease genes from
other bacterial species. The DNA and protein coding sequence of this 177 bp fragment is
provided in Fig. 13.

Sequence Analysis 

The sequences were analyzed by BLAST and other protein translation sequence
tools. BLAST comparison at the nucleotide level showed various levels of identity to
published serine protease sequences. Initially, nucleotide sequences were submitted to
BLAST (Basic BLAST version 2.0). The program chosen was "BlastX", and the database
chosen was "nr." Standard/default parameter values were employed. Sequence data for
the putative Cellulomonas 69B4 protease gene fragment was entered in FASTA format and the
query submitted to BLAST to compare the sequences of the present invention to those
already in the database. For the 177 bp fragment, the results returned a high number of hits
for protease genes from various Streptomyces spp., including S. griseus, S. lividans, S.
coelicolor, S. albogriseolus, S. platensis, S. fradiae, and Streptomyces sp. It was concluded
that further analysis of the 177 bp fragment cloned from Cellulomonas sp. isolate 69B4 was
desired.



EXAMPLE 4 

Isolation of a Polynucleotide Sequence from the Genome
of Cellulomonas 69B4 Encoding a Serine Protease by Inverse PCR

In this Example, experiments conducted to isolate a polynucleotide sequence
encoding a serine protease produced by Cellulomonas sp. 69B4 are described.

Inverse PCR of Cellulomonas sp. 69B4 Genomic DNA to Isolate the Gene Encoding
Cellulomonas strain 69B4 Protease

Inverse PCR was used to isolate and clone the full-length serine protease gene from
Cellulomonas sp. 69B4. Based on the DNA sequence of the 177 bp fragment of the
Cellulomonas protease gene described in Example 3, novel DNA primers were designed:

69B4int_RV1 5'-CGGGGTAGGTGACCGAGGAGTTGAGCGCAGTG-3' (SEQ ID NO:14)
69B4int_FW2 5'-GCTCGCCGGCAACCAGGCCCAGGGCGTCACGTC-3' (SEQ ID NO:15)

Chromosomal DNA of Cellulomonas sp. 69B4 was digested with the restriction
enzymes ApaI, BamHI, BssHII, KpnI, NarI, NcoI, NheI, PvuI, SalI or SstII, purified using the
Qiagen PCR purification kit (Qiagen, Catalogue # 28106) and self-ligated with T4 DNA
ligase (Invitrogen) according to the manufacturers' instructions. Ligation mixtures were
purified using the Qiagen PCR purification kit, and PCR was performed with primers
69B4int_RV1 and 69B4int_FW2. PCR on DNA fragments that were digested with NcoI, and
then self-ligated, resulted in a PCR product of approximately 1.3 kb. DNA sequence
analysis (BaseClear) revealed that this DNA fragment covers the main part of a
streptogrisin-like protease gene from Cellulomonas. This protease was designated as
"69B4 protease," and the gene encoding Cellulomonas 69B4 protease was designated as
the "asp gene." The entire sequence of the asp gene was derived by additional inverse
PCR reactions with primer 69B4int_FW2 and another primer: 69B4-for4 (5' AAC GGC
GGG TTC ATC ACC GCC GGC CAC TGC GGC C 3' (SEQ ID NO:16)). Inverse PCR with
these primers on NcoI, BssHII, ApaI and PvuI digested and self-ligated DNA fragments of
genomic DNA of Cellulomonas sp. 69B4 resulted in the identification of the entire sequence
of the asp gene.
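
Inverse PCR depends on primers that extend away from each other on the linear template, so
that amplification proceeds around the self-ligated circle. The sketch below checks this
orientation for 69B4int_RV1 and 69B4int_FW2 against a 180-nucleotide top-strand excerpt
(positions 1081-1260 as printed for SEQ ID NO:1 in this document). The function names are
assumptions made for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Top-strand excerpt, positions 1081-1260 of SEQ ID NO:1 as printed in this document.
TEMPLATE = (
    "CGGGTGGCACTGCGGCACCATCACTGCGCTCAACTCCTCGGTCACCTACCCCGAGGGCAC"
    "CGTCCGCGGCCTGATCCGCACCACCGTCTGCGCCGAGCCCGGCGACTCCGGTGGCTCGCT"
    "GCTCGCCGGCAACCAGGCCCAGGGCGTCACGTCCGGCGGCTCCGGCAACTGCCGCACCGG"
)
RV1 = "CGGGGTAGGTGACCGAGGAGTTGAGCGCAGTG"   # 69B4int_RV1, SEQ ID NO:14
FW2 = "GCTCGCCGGCAACCAGGCCCAGGGCGTCACGTC"  # 69B4int_FW2, SEQ ID NO:15


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def primer_sites(template, reverse_primer, forward_primer):
    """Return 0-based start positions of the reverse and forward primer sites."""
    rv_start = template.find(reverse_complement(reverse_primer))
    fw_start = template.find(forward_primer)
    if rv_start < 0 or fw_start < 0:
        raise ValueError("primer site not found in template")
    return rv_start, fw_start


def outward_facing(template, reverse_primer, forward_primer):
    """True when the reverse primer site lies wholly upstream of the forward primer site."""
    rv_start, fw_start = primer_sites(template, reverse_primer, forward_primer)
    return rv_start + len(reverse_primer) <= fw_start


print(outward_facing(TEMPLATE, RV1, FW2))
```

With the reverse primer site upstream of the forward primer site, extension proceeds
leftward from the former and rightward from the latter, which is the arrangement inverse PCR
requires.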

Nucleotide and Amino Acid Sequences 

For convenience, various sequences are included below. First, the DNA sequence 
of the asp gene (SEQ ID NO:1) provided below encodes the signal peptide (SEQ ID NO:9) 
and the precursor serine protease (SEQ ID NO:7) derived from Cellulomonas strain 69B4 
(DSM 16035). The initiating polynucleotide encoding the signal peptide of the Cellulomonas 
strain 69B4 protease is in bold (ATG). 



1 GCGCGCTGCG CCCACGACGA CGCCGTCCGC CGTTCGCCGG CGTACCTGCG TTGGCTCACC

CGCGCGACGC GGGTGCTGCT GCGGCAGGCG GCAAGCGGCC GCATGGACGC AACCGAGTGG
61 ACCCACCAGA TCGACCTCCA TAACGAGGCC GTATGACCAG AAAGGGATCT GCCACCGCCC

TGGGTGGTCT AGCTGGAGGT ATTGCTCCGG CATACTGGTC TTTCCCTAGA CGGTGGCGGG
121 ACCAGCACGC TCCTAACCTC CGAGCACCGG CGACCGCCGG GTGCGATGAA AGGGACGAAC

TGGTCGTGCG AGGATTGGAG GCTCGTGGCC GCTGGCGGCC CACGCTACTT TCCCTGCTTG
181 CGAGATGACA CCACGCACAG TCACGCGGGC CCTGGCCGTG GCCACCGCAG CCGCCACACT

GCTCTACTGT GGTGCGTGTC AGTGCGCCCG GGACCGGCAC CGGTGGCGTC GGCGGTGTGA
241 CCTGGCAGGC GGCATGGCCG CCCAGGCCAA CGAGCCCGCA CCACCCGGGA GCGCGAGCGC

GGACCGTCCG CCGTACCGGC GGGTCCGGTT GCTCGGGCGT GGTGGGCCCT CGCGCTCGCG
301 ACCGCCACGC CTGGCCGAGA AGCTCGACCC CGACCTCCTC GAGGCCATGG AGCGCGACCT

TGGCGGTGCG GACCGGCTCT TCGAGCTGGG GCTGGAGGAG CTCCGGTACC TCGCGCTGGA
361 GGGCCTCGAC GCGGAGGAAG CCGCCGCCAC CCTGGCGTTC CAGCACGACG CAGCCGAGAC

CCCGGAGCTG CGCCTCCTTC GGCGGCGGTG GGACCGCAAG GTCGTGCTGC GTCGGCTCTG



WO 2005/052146 PCT/US2004/039066 


421 CGGCGAGGCC CTCGCCGAAG AGCTCGACGA GGACTTCGCC GGCACCTGGG TCGAGGACGA

GCCGCTCCGG GAGCGGCTTC TCGAGCTGCT CCTGAAGCGG CCGTGGACCC AGCTCCTGCT
481 CGTCCTGTAC GTCGCCACCA CCGACGAGGA CGCCGTCGAG GAGGTCGAGG GCGAAGGCGC

GCAGGACATG CAGCGGTGGT GGCTGCTCCT GCGGCAGCTC CTCCAGCTCC CGCTTCCGCG
541 CACGGCCGTC ACCGTCGAGC ACTCCCTGGC CGACCTCGAG GCCTGGAAGA CCGTCCTCGA

GTGCCGGCAG TGGCAGCTCG TGAGGGACCG GCTGGAGCTC CGGACCTTCT GGCAGGAGCT
601 CGCCGCCCTC GAGGGCCACG ACGACGTGCC CACCTGGTAC GTCGACGTCC CGACCAACAG

GCGGCGGGAG CTCCCGGTGC TGCTGCACGG GTGGACCATG CAGCTGCAGG GCTGGTTGTC
661 CGTCGTCGTC GCCGTCAAGG CCGGAGCCCA GGACGTCGCC GCCGGCCTCG TCGAAGGTGC

GCAGCAGCAG CGGCAGTTCC GGCCTCGGGT CCTGCAGCGG CGGCCGGAGC AGCTTCCACG
721 CGACGTCCCG TCCGACGCCG TGACCTTCGT CGAGACCGAC GAGACCCCGC GGACCATGTT

GCTGCAGGGC AGGCTGCGGC ACTGGAAGCA GCTCTGGCTG CTCTGGGGCG CCTGGTACAA
781 CGACGTGATC GGCGGCAACG CCTACACCAT CGGGGGGCGC AGCCGCTGCT CGATCGGGTT

GCTGCACTAG CCGCCGTTGC GGATGTGGTA GCCCCCCGCG TCGGCGACGA GCTAGCCCAA
841 CGCGGTCAAC GGCGGGTTCA TCACCGCCGG CCACTGCGGC CGCACCGGCG CCACCACCGC

GCGCCAGTTG CCGCCCAAGT AGTGGCGGCC GGTGACGCCG GCGTGGCCGC GGTGGTGGCG
901 CAACCCCACC GGGACCTTCG CCGGGTCCAG CTTCCCGGGC AACGACTACG CGTTCGTCCG

GTTGGGGTGG CCCTGGAAGC GGCCCAGGTC GAAGGGCCCG TTGCTGATGC GCAAGCAGGC
961 TACCGGGGCC GGCGTGAACC TGCTGGCCCA GGTCAACAAC TACTCCGGTG GCCGCGTCCA

ATGGCCCCGG CCGCACTTGG ACGACCGGGT CCAGTTGTTG ATGAGGCCAC CGGCGCAGGT
1021 GGTCGCCGGG CACACCGCGG CCCCCGTCGG CTCGGCCGTG TGCCGGTCCG GGTCGACCAC

CCAGCGGCCC GTGTGGCGCC GGGGGCAGCC GAGCCGGCAC ACGGCCAGGC CCAGCTGGTG
1081 CGGGTGGCAC TGCGGCACCA TCACTGCGCT CAACTCCTCG GTCACCTACC CCGAGGGCAC

GCCCACCGTG ACGCCGTGGT AGTGACGCGA GTTGAGGAGC CAGTGGATGG GGCTCCCGTG
1141 CGTCCGCGGC CTGATCCGCA CCACCGTCTG CGCCGAGCCC GGCGACTCCG GTGGCTCGCT

GCAGGCGCCG GACTAGGCGT GGTGGCAGAC GCGGCTCGGG CCGCTGAGGC CACCGAGCGA
1201 GCTCGCCGGC AACCAGGCCC AGGGCGTCAC GTCCGGCGGC TCCGGCAACT GCCGCACCGG

CGAGCGGCCG TTGGTCCGGG TCCCGCAGTG CAGGCCGCCG AGGCCGTTGA CGGCGTGGCC
1261 TGGCACCACG TTCTTCCAGC CGGTCAACCC CATCCTCCAG GCGTACGGCC TGAGGATGAT

ACCGTGGTGC AAGAAGGTCG GCCAGTTGGG GTAGGAGGTC CGCATGCCGG ACTCCTACTA
1321 CACCACGGAC TCGGGCAGCA GCCCGGCCCC TGCACCGACC TCCTGCACCG GCTACGCCCG

GTGGTGCCTG AGCCCGTCGT CGGGCCGGGG ACGTGGCTGG AGGACGTGGC CGATGCGGGC
1381 CACCTTCACC GGGACCCTCG CGGCCGGCCG GGCCGCCGCC CAGCCCAACG GGTCCTACGT

GTGGAAGTGG CCCTGGGAGC GCCGGCCGGC CCGGCGGCGG GTCGGGTTGC CCAGGATGCA
1441 GCAGGTCAAC CGGTCCGGGA CCCACAGCGT GTGCCTCAAC GGGCCCTCCG GTGCGGACTT

CGTCCAGTTG GCCAGGCCCT GGGTGTCGCA CACGGAGTTG CCCGGGAGGC CACGCCTGAA
1501 CGACCTCTAC GTGCAGCGCT GGAACGGCAG CTCCTGGGTG ACCGTCGCCC AGAGCACCTG

GCTGGAGATG CACGTCGCGA CCTTGCCGTC GAGGACCCAC TGGCAGCGGG TCTCGTGGAG
1561 CCCCGGCTCC AACGAGACCA TCACCTACCG CGGCAACGCC GGCTACTACC GCTACGTGGT

GGGGCCGAGG TTGCTCTGGT AGTGGATGGC GCCGTTGCGG CCGATGATGG CGATGCACCA
1621 CAACGCCGCG TCCGGCTCCG GTGCCTACAC CATGGGGCTC ACCCTCCCCT GACGTAGCGC

GTTGCGGCGC AGGCCGAGGC CACGGATGTG GTACCCCGAG TGGGAGGGGA CTGCATCGCG (SEQ ID NO:1)
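
Because the listing prints both strands, each lower line should be the base-by-base
complement of the line above it. A minimal sketch of that consistency check is shown below,
using the row that starts at position 421. The function name and the deliberately corrupted
example are assumptions introduced for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def strand_mismatches(top, bottom):
    """Return 0-based positions where the lower strand is not the complement of the upper."""
    top = top.replace(" ", "")
    bottom = bottom.replace(" ", "")
    if len(top) != len(bottom):
        raise ValueError("strands differ in length")
    expected = top.translate(COMPLEMENT)
    return [i for i, (e, b) in enumerate(zip(expected, bottom)) if e != b]


TOP_421 = "CGGCGAGGCC CTCGCCGAAG AGCTCGACGA GGACTTCGCC GGCACCTGGG TCGAGGACGA"
BOTTOM_421 = "GCCGCTCCGG GAGCGGCTTC TCGAGCTGCT CCTGAAGCGG CCGTGGACCC AGCTCCTGCT"

print(strand_mismatches(TOP_421, BOTTOM_421))

# A deliberately corrupted lower strand (hypothetical) is flagged at the damaged position.
print(strand_mismatches("TCACTGCGCT", "AGTGACGXGA"))
```

A check of this kind is useful when a printed listing has been transcribed, since a single
misread base on one strand appears as an isolated mismatch.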

45 The following DNA sequence (SEQ ID NO:2) encodes the signal peptide (SEQ ID 

NO:9) that is operatively linked to the precursor protease (SEQ ID NO:7) derived from 
Cellulomonas strain 69B4 (DSM 16035). The initiating polynucleotide encoding the signal 
peptide of the Cellulomonas strain 69B4 protease is in bold (ATG). The asterisk indicates 
the termination codon (TGA), beginning with residue 1486. Residues 85, 595, and 1 162, 

so relate to the initial residues of the N terminal prosequence, mature sequence and Carboxyl 
terminal prosequence, respectively, are bolded and underlined. 
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1 ATGACACCAC GCACAGTCAC GCGGGCCCTG GCCGTGGCCA CCGCAGCCGC CACACTCCTG 

TACTGTGGTG CGTGTCAGTG CGCCCGGGAC CGGCACCGGT GGCGTCGGCG GTGTGAGGAC 

.85 

61 GCAGGCGGCA TGGCCGCCCA GGCCAACGAG CCCGCACCAC CCGGGAGCGC GAGCGCACCG 

5 CGTCCGCCGT ACCGGCGGGT CCGGTTGCTC GGGCGTGGTG GGCCCTCG CG CTCGCGTGGC 

121 CCACGCCTGG CCGAGAAGCT CGACCCCGAC CTCCTCGAGG CCATGGAGCG CGACCTGGGC 

GGTGCGGACC GGCTCTTCGA GCTGGGGCTG GAGGAGCTCC GGTACCTCGC GCTGGACCCG 
181 CTCGACGCGG AGGAAGCCGC CGCCACCCTG GCGTTCCAGC ACGACGCAGC CGAGACCGGC 

GAGCTGCGCC TCCTTCGGCG GCGGTGGGAC CGCAAGGTCG TGCTGCGTCG GCTCTGGCCG 
241 GAGGCCCTCG CCGAAGAGCT CGACGAGGAC TTCGCCGGCA CCTGGGTCGA GGACGACGTC 

CTCCGGGAGC GGCTTCTCGA GCTGCTCCTG AAGCGGCCGT GGACCCAGCT CCTGCTGCAG 
301 CTGTACGTCG CCACCACCGA CGAGGACGCC GTCGAGGAGG TCGAGGGCGA AGGCGCCACG 

GACATGCAGC GGTGGTGGCT GCTCCTGCGG CAGCTCCTCC AGCTCCCGCT TCCGCGGTGC 
361 GCCGTCACCG TCGAGCACTC CCTGGCCGAC CTCGAGGCCT GGAAGACCGT CCTCGACGCC 

CGGCAGTGGC AGCTCGTGAG GGACCGGCTG GAGCTCCGGA CCTTCTGGCA GGAGCTGCGG 

421 GCCCTCGAGG GCCACGACGA CGTGCCCACC TGGTACGTCG ACGTCCCGAC CAACAGCGTC 

CGGGAGCTCC CGGTGCTGCT GCACGGGTGG ACCATGCAGC TGCAGGGCTG GTTGTCGCAG 
481 GTCGTCGCCG TCAAGGCCGG AGCCCAGGAC GTCGCCGCCG GCCTCGTCGA AGGTGCCGAC 

CAGCAGCGGC AGTTCCGGCC TCGGGTCCTG CAGCGGCGGC CGGAGCAGCT TCCACGGCTG 
595 

541 GTCCCGTCCG ACGCCGTGAC CTTCGTCGAG ACCGACGAGA CCCCGCGGAC CATGTTCGAC 

CAGGGCAGGC TGCGGCACTG GAAGCAGCTC TGGCTGCTCT GGGGCGCCTG GTACAAGCTG 
601 GTGATCGGCG GCAACGCCTA CACCATCGGG GGGCGCAGCC GCTGCTCGAT CGGGTTCGCG 

CACTAGCCGC CGTTGCGGAT GTGGTAGCCC CCCGCGTCGG CGACGAGCTA GCCCAAGCGC 
661 GTCAACGGCG GGTTCATCAC CGCCGGCCAC TGCGGCCGCA CCGGCGCCAC CACCGCCAAC 

CAGTTGCCGC CCAAGTAGTG GCGGCCGGTG ACGCCGGCGT GGCCGCGGTG GTGGCGGTTG 
721 CCCACCGGGA CCTTCGCCGG GTCCAGCTTC CCGGGCAACG ACTACGCGTT CGTCCGTACC 

GGGTGGCCCT GGAAGCGGCC CAGGTCGAAG GGCCCGTTGC TGATGCGCAA GCAGGCATGG 
781 GGGGCCGGCG TGAACCTGCT GGCCCAGGTC AACAACTACT CCGGTGGCCG CGTCCAGGTC 

CCCCGGCCGC ACTTGGACGA CCGGGTCCAG TTGTTGATGA GGCCACCGGC GCAGGTCCAG 

841 GCCGGGCACA CCGCGGCCCC CGTCGGCTCG GCCGTGTGCC GGTCCGGGTC GACCACCGGG 

CGGCCCGTGT GGCGCCGGGG GCAGCCGAGC CGGCACACGG CCAGGCCCAG CTGGTGGCCC 
901 TGGCACTGCG GCACCATCAC TGCGCTCAAC TCCTCGGTCA CCTACCCCGA GGGCACCGTC 

ACCGTGACGC CGTGGTAGTG ACGCGAGTTG AGGAGCCAGT GGATGGGGCT CCCGTGGCAG 
961 CGCGGCCTGA TCCGCACCAC CGTCTGCGCC GAGCCCGGCG ACTCCGGTGG CTCGCTGCTC 

GCGCCGGACT AGGCGTGGTG GCAGACGCGG CTCGGGCCGC TGAGGCCACC GAGCGACGAG 
1021 GCCGGCAACC AGGCCCAGGG CGTCACGTCC GGCGGCTCCG GCAACTGCCG CACCGGTGGC 

CGGCCGTTGG TCCGGGTCCC GCAGTGCAGG CCGCCGAGGC CGTTGACGGC GTGGCCACCG 
1081 ACCACGTTCT TCCAGCCGGT CAACCCCATC CTCCAGGCGT ACGGCCTGAG GATGATCACC 

TGGTGCAAGA AGGTCGGCCA GTTGGGGTAG GAGGTCCGCA TGCCGGACTC CTACTAGTGG 

1162 

1141 ACGGACTCGG GCAGCAGCCC GGCCCCTGCA CCGACCTCCT GCACCGGCTA CGCCCGCACC 

TGCCTGAGCC CGTCGTCGGG CCGGGGACGT GGCTGGAGGA CGTGGCCGAT GCGGGCGTGG 
1201 TTCACCGGGA CCCTCGCGGC CGGCCGGGCC GCCGCCCAGC CCAACGGGTC CTACGTGCAG 

AAGTGGCCCT GGGAGCGCCG GCCGGCCCGG CGGCGGGTCG GGTTGCCCAG GATGCACGTC 

1261 GTCAACCGGT CCGGGACCCA CAGCGTGTGC CTCAACGGGC CCTCCGGTGC GGACTTCGAC 

CAGTTGGCCA GGCCCTGGGT GTCGCACACG GAGTTGCCCG GGAGGCCACG CCTGAAGCTG 
1321 CTCTACGTGC AGCGCTGGAA CGGCAGCTCC TGGGTGACCG TCGCCCAGAG CACCTCCCCC 

GAGATGCACG TCGCGACCTT GCCGTCGAGG ACCCACTGGC AGCGGGTCTC GTGGAGGGGG 
1381 GGCTCCAACG AGACCATCAC CTACCGCGGC AACGCCGGCT ACTACCGCTA CGTGGTCAAC 

CCGAGGTTGC TCTGGTAGTG GATGGCGCCG TTGCGGCCGA TGATGGCGAT GCACCAGTTG 

1486* 

1441 GCCGCGTCCG GCTCCGGTGC CTACACCATG GGGCTCACCC TCCCCTGA (SEQ ID NO: 2) 

CGGCGCAGGC CGAGGCCACG GATGTGGTAC CCCGAGTGGG AGGGGACT 
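
The landmark positions quoted for SEQ ID NO:2 (residues 85, 595, 1162 and 1486) can be cross-checked against the segment lengths stated for SEQ ID NO:6. The following is a minimal sketch of that arithmetic; the helper name `segment_starts` is hypothetical, and the segment lengths are derived here from the residue ranges 1-28, 29-198, 199-387 and 388-495 given for SEQ ID NO:6.

```python
# Segment lengths in amino acids, derived from the residue ranges
# stated for SEQ ID NO:6 (1-28, 29-198, 199-387, 388-495).
SEGMENTS = [
    ("signal peptide", 28),
    ("N-terminal prosequence", 170),
    ("mature protease", 189),
    ("C-terminal prosequence", 108),
]


def segment_starts(segments):
    """Map each segment to the 1-based nucleotide position of its first codon."""
    starts = {}
    residues_before = 0
    for name, length in segments:
        # Each preceding residue occupies three nucleotides.
        starts[name] = 3 * residues_before + 1
        residues_before += length
    # The termination codon directly follows the last coding codon.
    starts["stop codon"] = 3 * residues_before + 1
    return starts


starts = segment_starts(SEGMENTS)
for name, position in starts.items():
    print(f"{name}: nucleotide {position}")
```

Under these assumptions the computed start positions agree with the landmarks marked in the listing above.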



The following DNA sequence (SEQ ID NO:3) encodes the precursor protease 
derived from Cellulomonas strain 69B4 (DSM 16035). 

1 AACGAGCCCG CACCACCCGG GAGCGCGAGC GCACCGCCAC GCCTGGCCGA GAAGCTCGAC 

TTGCTCGGGC GTGGTGGGCC CTCGCGCTCG CGTGGCGGTG CGGACCGGCT CTTCGAGCTG 
61 CCCGACCTCC TCGAGGCCAT GGAGCGCGAC CTGGGCCTCG ACGCGGAGGA AGCCGCCGCC 

GGGCTGGAGG AGCTCCGGTA CCTCGCGCTG GACCCGGAGC TGCGCCTCCT TCGGCGGCGG 
121 ACCCTGGCGT TCCAGCACGA CGCAGCCGAG ACCGGCGAGG CCCTCGCCGA AGAGCTCGAC 

TGGGACCGCA AGGTCGTGCT GCGTCGGCTC TGGCCGCTCC GGGAGCGGCT TCTCGAGCTG 
181 GAGGACTTCG CCGGCACCTG GGTCGAGGAC GACGTCCTGT ACGTCGCCAC CACCGACGAG 

CTCCTGAAGC GGCCGTGGAC CCAGCTCCTG CTGCAGGACA TGCAGCGGTG GTGGCTGCTC 

241 GACGCCGTCG AGGAGGTCGA GGGCGAAGGC GCCACGGCCG TCACCGTCGA GCACTCCCTG 

CTGCGGCAGC TCCTCCAGCT CCCGCTTCCG CGGTGCCGGC AGTGGCAGCT CGTGAGGGAC 
301 GCCGACCTCG AGGCCTGGAA GACCGTCCTC GACGCCGCCC TCGAGGGCCA CGACGACGTG 

CGGCTGGAGC TCCGGACCTT CTGGCAGGAG CTGCGGCGGG AGCTCCCGGT GCTGCTGCAC 
361 CCCACCTGGT ACGTCGACGT CCCGACCAAC AGCGTCGTCG TCGCCGTCAA GGCCGGAGCC 

GGGTGGACCA TGCAGCTGCA GGGCTGGTTG TCGCAGCAGC AGCGGCAGTT CCGGCCTCGG 
421 CAGGACGTCG CCGCCGGCCT CGTCGAAGGT GCCGACGTCC CGTCCGACGC CGTGACCTTC 

GTCCTGCAGC GGCGGCCGGA GCAGCTTCCA CGGCTGCAGG GCAGGCTGCG GCACTGGAAG 
481 GTCGAGACCG ACGAGACCCC GCGGACCATG TTCGACGTGA TCGGCGGCAA CGCCTACACC 

CAGCTCTGGC TGCTCTGGGG CGCCTGGTAC AAGCTGCACT AGCCGCCGTT GCGGATGTGG 

541 ATCGGGGGGC GCAGCCGCTG CTCGATCGGG TTCGCGGTCA ACGGCGGGTT CATCACCGCC 

TAGCCCCCCG CGTCGGCGAC GAGCTAGCCC AAGCGCCAGT TGCCGCCCAA GTAGTGGCGG 
601 GGCCACTGCG GCCGCACCGG CGCCACCACC GCCAACCCCA CCGGGACCTT CGCCGGGTCC 

CCGGTGACGC CGGCGTGGCC GCGGTGGTGG CGGTTGGGGT GGCCCTGGAA GCGGCCCAGG 
661 AGCTTCCCGG GCAACGACTA CGCGTTCGTC CGTACCGGGG CCGGCGTGAA CCTGCTGGCC 

TCGAAGGGCC CGTTGCTGAT GCGCAAGCAG GCATGGCCCC GGCCGCACTT GGACGACCGG 
721 CAGGTCAACA ACTACTCCGG TGGCCGCGTC CAGGTCGCCG GGCACACCGC GGCCCCCGTC 

GTCCAGTTGT TGATGAGGCC ACCGGCGCAG GTCCAGCGGC CCGTGTGGCG CCGGGGGCAG 
781 GGCTCGGCCG TGTGCCGGTC CGGGTCGACC ACCGGGTGGC ACTGCGGCAC CATCACTGCG 

CCGAGCCGGC ACACGGCCAG GCCCAGCTGG TGGCCCACCG TGACGCCGTG GTAGTGACGC 

841 CTCAACTCCT CGGTCACCTA CCCCGAGGGC ACCGTCCGCG GCCTGATCCG CACCACCGTC 

GAGTTGAGGA GCCAGTGGAT GGGGCTCCCG TGGCAGGCGC CGGACTAGGC GTGGTGGCAG 
901 TGCGCCGAGC CCGGCGACTC CGGTGGCTCG CTGCTCGCCG GCAACCAGGC CCAGGGCGTC 

ACGCGGCTCG GGCCGCTGAG GCCACCGAGC GACGAGCGGC CGTTGGTCCG GGTCCCGCAG 
961 ACGTCCGGCG GCTCCGGCAA CTGCCGCACC GGTGGCACCA CGTTCTTCCA GCCGGTCAAC 

TGCAGGCCGC CGAGGCCGTT GACGGCGTGG CCACCGTGGT GCAAGAAGGT CGGCCAGTTG 
1021 CCCATCCTCC AGGCGTACGG CCTGAGGATG ATCACCACGG ACTCGGGCAG CAGCCCGGCC 

GGGTAGGAGG TCCGCATGCC GGACTCCTAC TAGTGGTGCC TGAGCCCGTC GTCGGGCCGG 
1081 CCTGCACCGA CCTCCTGCAC CGGCTACGCC CGCACCTTCA CCGGGACCCT CGCGGCCGGC 

GGACGTGGCT GGAGGACGTG GCCGATGCGG GCGTGGAAGT GGCCCTGGGA GCGCCGGCCG 

1141 CGGGCCGCCG CCCAGCCCAA CGGGTCCTAC GTGCAGGTCA ACCGGTCCGG GACCCACAGC 

GCCCGGCGGC GGGTCGGGTT GCCCAGGATG CACGTCCAGT TGGCCAGGCC CTGGGTGTCG 
1201 GTGTGCCTCA ACGGGCCCTC CGGTGCGGAC TTCGACCTCT ACGTGCAGCG CTGGAACGGC 

CACACGGAGT TGCCCGGGAG GCCACGCCTG AAGCTGGAGA TGCACGTCGC GACCTTGCCG 
1261 AGCTCCTGGG TGACCGTCGC CCAGAGCACC TCCCCCGGCT CCAACGAGAC CATCACCTAC 

TCGAGGACCC ACTGGCAGCG GGTCTCGTGG AGGGGGCCGA GGTTGCTCTG GTAGTGGATG 
1321 CGCGGCAACG CCGGCTACTA CCGCTACGTG GTCAACGCCG CGTCCGGCTC CGGTGCCTAC 

GCGCCGTTGC GGCCGATGAT GGCGATGCAC CAGTTGCGGC GCAGGCCGAG GCCACGGATG 
1381 ACCATGGGGC TCACCCTCCC CTGA (SEQ ID NO: 3) 

TGGTACCCCG AGTGGGAGGG GACT 



The following DNA sequence (SEQ ID NO:4) encodes the mature protease derived 
from Cellulomonas strain 69B4 (DSM 16035). 

1 TTCGACGTGA TCGGCGGCAA CGCCTACACC ATCGGGGGGC GCAGCCGCTG CTCGATCGGG 

AAGCTGCACT AGCCGCCGTT GCGGATGTGG TAGCCCCCCG CGTCGGCGAC GAGCTAGCCC 
61 TTCGCGGTCA ACGGCGGGTT CATCACCGCC GGCCACTGCG GCCGCACCGG CGCCACCACC 

AAGCGCCAGT TGCCGCCCAA GTAGTGGCGG CCGGTGACGC CGGCGTGGCC GCGGTGGTGG 

121 GCCAACCCCA CCGGGACCTT CGCCGGGTCC AGCTTCCCGG GCAACGACTA CGCGTTCGTC 

CGGTTGGGGT GGCCCTGGAA GCGGCCCAGG TCGAAGGGCC CGTTGCTGAT GCGCAAGCAG 

181 CGTACCGGGG CCGGCGTGAA CCTGCTGGCC CAGGTCAACA ACTACTCCGG TGGCCGCGTC 

GCATGGCCCC GGCCGCACTT GGACGACCGG GTCCAGTTGT TGATGAGGCC ACCGGCGCAG 
241 CAGGTCGCCG GGCACACCGC GGCCCCCGTC GGCTCGGCCG TGTGCCGGTC CGGGTCGACC 

GTCCAGCGGC CCGTGTGGCG CCGGGGGCAG CCGAGCCGGC ACACGGCCAG GCCCAGCTGG 
301 ACCGGGTGGC ACTGCGGCAC CATCACTGCG CTCAACTCCT CGGTCACCTA CCCCGAGGGC 
TGGCCCACCG TGACGCCGTG GTAGTGACGC GAGTTGAGGA GCCAGTGGAT GGGGCTCCCG 

361 ACCGTCCGCG GCCTGATCCG CACCACCGTC TGCGCCGAGC CCGGCGACTC CGGTGGCTCG 

TGGCAGGCGC CGGACTAGGC GTGGTGGCAG ACGCGGCTCG GGCCGCTGAG GCCACCGAGC 
421 CTGCTCGCCG GCAACCAGGC CCAGGGCGTC ACGTCCGGCG GCTCCGGCAA CTGCCGCACC 

GACGAGCGGC CGTTGGTCCG GGTCCCGCAG TGCAGGCCGC CGAGGCCGTT GACGGCGTGG 
481 GGTGGCACCA CGTTCTTCCA GCCGGTCAAC CCCATCCTCC AGGCGTACGG CCTGAGGATG 

CCACCGTGGT GCAAGAAGGT CGGCCAGTTG GGGTAGGAGG TCCGCATGCC GGACTCCTAC 
541 ATCACCACGG ACTCGGGCAG CAGCCCG (SEQ ID NO:4) 
TAGTGGTGCC TGAGCCCGTC GTCGGGC 


The following DNA sequence (SEQ ID NO:5) encodes the signal peptide derived 
from Cellulomonas strain 69B4 (DSM 16035). 

1 ATGACACCAC GCACAGTCAC GCGGGCCCTG GCCGTGGCCA CCGCAGCCGC CACACTCCTG 

TACTGTGGTG CGTGTCAGTG CGCCCGGGAC CGGCACCGGT GGCGTCGGCG GTGTGAGGAC 
61 GCAGGCGGCA TGGCCGCCCA GGCC (SEQ ID NO:5) 

CGTCCGCCGT ACCGGCGGGT CCGG 
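
The two strands printed for SEQ ID NO:5 should be base-wise complements of one another, and the upper strand should translate to the 28-residue signal peptide of SEQ ID NO:9. The sketch below illustrates both checks. It assumes the standard genetic code and an 84-nucleotide upper strand whose second block reads GCACAGTCAC (the reading implied by the lower strand and by SEQ ID NO:2); the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(strand):
    """Return the base-wise complement (same left-to-right order as printed)."""
    return strand.translate(COMPLEMENT)


def translate(dna):
    """Translate a coding strand in frame 1, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Upper and lower strands of SEQ ID NO:5 as printed, in blocks of ten.
TOP = ("ATGACACCAC" "GCACAGTCAC" "GCGGGCCCTG" "GCCGTGGCCA" "CCGCAGCCGC"
       "CACACTCCTG" "GCAGGCGGCA" "TGGCCGCCCA" "GGCC")
BOTTOM = ("TACTGTGGTG" "CGTGTCAGTG" "CGCCCGGGAC" "CGGCACCGGT" "GGCGTCGGCG"
          "GTGTGAGGAC" "CGTCCGCCGT" "ACCGGCGGGT" "CCGG")

print(complement(TOP) == BOTTOM)
print(translate(TOP))
```

The same complement comparison is a convenient way to spot transcription errors in the longer listings, since a damaged base in one strand disagrees with its partner in the other.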

The following sequence is the amino acid sequence (SEQ ID NO:6) of the signal 
sequence and precursor protease derived from Cellulomonas strain 69B4 (DSM 16035), 
including the signal sequence [segments 1a-c] (residues 1-28 [-198 to -171]), an N-terminal 
prosequence [segments 2a-r] (residues 29-198 [-170 to -1]), a mature protease [segments 
3a-t] (residues 199-387 [1-189]), and a C-terminal prosequence [segments 4a-l] (residues 
388-495 [190-398]) encoded by the DNA sequences set forth in SEQ ID NOS:1, 2, 3 and 4. 
The N-terminal sequence of the mature protease amino acid sequence is in bold. 



1 MTPRTVTRAL AVATAAATLL AGGMAAQANE PAPPGSASAP PRLAEKLDPD 
    1a 1b 1c 2a 2b 2c 

51 LLEAMERDLG LDAEEAAATL AFQHDAAETG EALAEELDED FAGTWVEDDV 
    2d 2e 2f 2g 2h 

101 LYVATTDEDA VEEVEGEGAT AVTVEHSLAD LEAWKTVLDA ALEGHDDVPT 
    2i 2j 2k 2l 2m 

151 WYVDVPTNSV VVAVKAGAQD VAAGLVEGAD VPSDAVTFVE TDETPRTMFD 
    2n 2o 2p 2q 2r 3a 

201 VIGGNAYTIG GRSRCSIGFA VNGGFITAGH CGRTGATTAN PTGTFAGSSF 
    3b 3c 3d 3e 3f 

251 PGNDYAFVRT GAGVNLLAQV NNYSGGRVQV AGHTAAPVGS AVCRSGSTTG 
    3g 3h 3i 3j 3k 

301 WHCGTITALN SSVTYPEGTV RGLIRTTVCA EPGDSGGSLL AGNQAQGVTS 
    3l 3m 3n 3o 3p 

351 GGSGNCRTGG TTFFQPVNPI LQAYGLRMIT TDSGSSPAPA PTSCTGYART 
    3q 3r 3s 3t 4a 4b 

401 FTGTLAAGRA AAQPNGSYVQ VNRSGTHSVC LNGPSGADFD LYVQRWNGSS 
    4c 4d 4e 4f 4g 

451 WVTVAQSTSP GSNETITYRG NAGYYRYVVN AASGSGAYTM GLTLP (SEQ ID NO:6) 
    4h 4i 4j 4k 4l 



The following sequence (SEQ ID NO:7) is the amino acid sequence of the precursor 
protease derived from Cellulomonas strain 69B4 (DSM 16035). 

1 NEPAPPGSAS APPRLAEKLD PDLLEAMERD LGLDAEEAAA TLAFQHDAAE 
51 TGEALAEELD EDFAGTWVED DVLYVATTDE DAVEEVEGEG ATAVTVEHSL 
101 ADLEAWKTVL DAALEGHDDV PTWYVDVPTN SVVVAVKAGA QDVAAGLVEG 
151 ADVPSDAVTF VETDETPRTM FDVIGGNAYT IGGRSRCSIG FAVNGGFITA 
201 GHCGRTGATT ANPTGTFAGS SFPGNDYAFV RTGAGVNLLA QVNNYSGGRV 
251 QVAGHTAAPV GSAVCRSGST TGWHCGTITA LNSSVTYPEG TVRGLIRTTV 
301 CAEPGDSGGS LLAGNQAQGV TSGGSGNCRT GGTTFFQPVN PILQAYGLRM 
351 ITTDSGSSPA PAPTSCTGYA RTFTGTLAAG RAAAQPNGSY VQVNRSGTHS 
401 VCLNGPSGAD FDLYVQRWNG SSWVTVAQST SPGSNETITY RGNAGYYRYV 
451 VNAASGSGAY TMGLTLP (SEQ ID NO:7) 



The following sequence (SEQ ID NO:8) is the amino acid sequence of the mature 
protease derived from Cellulomonas strain 69B4 (DSM 16035). The catalytic triad residues 
H32, D56 and S137 are bolded and underlined. 

1 FDVIGGNAYT IGGRSRCSIG FAVNGGFITA GHCGRTGATT ANPTGTFAGS 
51 SFPGNDYAFV RTGAGVNLLA QVNNYSGGRV QVAGHTAAPV GSAVCRSGST 

101 TGWHCGTITA LNSSVTYPEG TVRGLIRTTV CAEPGDSGGS LLAGNQAQGV 
151 TSGGSGNCRT GGTTFFQPVN PILQAYGLRM ITTDSGSSP (SEQ ID NO: 8) 
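
The positions of the catalytic residues can be read directly off SEQ ID NO:8 using 1-based mature numbering. The sketch below is illustrative only: it looks up positions 32 and 56 and locates the serine of the GDSGG motif, which is the usual sequence signature of the active-site serine in chymotrypsin-like proteases (treated here as an assumption, not as a statement from the specification). The helper name `residue` is hypothetical.

```python
# Mature protease sequence, SEQ ID NO:8 (189 residues).
MATURE = (
    "FDVIGGNAYTIGGRSRCSIGFAVNGGFITAGHCGRTGATTANPTGTFAGS"
    "SFPGNDYAFVRTGAGVNLLAQVNNYSGGRVQVAGHTAAPVGSAVCRSGST"
    "TGWHCGTITALNSSVTYPEGTVRGLIRTTVCAEPGDSGGSLLAGNQAQGV"
    "TSGGSGNCRTGGTTFFQPVNPILQAYGLRMITTDSGSSP"
)


def residue(sequence, position):
    """Return the residue at a 1-based position."""
    return sequence[position - 1]


# 1-based position of the G that opens the GDSGG motif; the serine is two further on.
motif_start = MATURE.find("GDSGG") + 1
serine_position = motif_start + 2

print("length:", len(MATURE))
print("position 32:", residue(MATURE, 32))
print("position 56:", residue(MATURE, 56))
print("GDSGG serine position:", serine_position)
```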


The following sequence (SEQ ID NO:9) is the amino acid sequence of the signal 
peptide of the protease derived from Cellulomonas strain 69B4 (DSM 16035). 

1 MTPRTVTRAL AVATAAATLL AGGMAAQA (SEQ ID NO:9) 
The following sequence (SEQ ID NO:10) is the degenerate primer used to identify a 
177 bp fragment of the protease of Cellulomonas strain 69B4. 

TTGWXCGT_FW: 5' ACNACSGGSTGGCRGTGCGGCAC 3' (SEQ ID NO:10) 

The following sequence (SEQ ID NO:11) is the reverse primer used to identify a 177 
bp fragment of the protease derived from Cellulomonas strain 69B4. 

GDSGGX_RV: 5'-ANGNGCCGCCGGAGTCNCC-3' (SEQ ID NO:11) 

The following are the DNA sequence (SEQ ID NO:13) and the amino acid sequence (SEQ ID 
NO:12) of the 177 bp fragment encoding part of the protease gene derived from Cellulomonas 
strain 69B4. The sequences of the degenerate primers (SEQ ID NOS:10 and 11) are 
underlined and in bold. 

DGW DCG TITA LNS SVT YPEG- 
1 ACGACGGCTG GGACTGCGGC ACC ATCACTG CGCTCAACTC CTCGGTCACC TACCCCGAGG 

TGCTGCCGAC CCTGACGCCG TGGTAGTGAC GCGAGTTGAG GAGCCAGTGG ATGGGGCTCC 

• TVR GLI RTTV CAE PGD SGGS- 
61 GCACCGTCCG CGGCCTGATC CGCACCACCG TCTGCGCCGA GCCCGGCGAC TCCGGTGGCT 

CGTGGCAGGC GCCGGACTAG GCGTGGTGGC AGACGCGGCT CGGGCCGCTG AGGCCACCGA 

• LLA GNQ AQGV TSG DSG GS 
121 CGCTGCTCGC CGGCAACCAG GCCCAGGGCG TCACGTCCGG CGACTCCGGC GGCTCAT 

GCGACGAGCG GCCGTTGGTC CGGGTCCCGC AGTGCAGGCC GCTGAGGCCG CCGAGTA 
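
Because both primers contain IUPAC ambiguity codes, each one denotes a pool of oligonucleotides. The following sketch counts the members of each pool and checks that the reverse complement of the reverse primer (SEQ ID NO:11) is compatible with the last 19 nucleotides of the upper strand of the fragment shown above. The function names are hypothetical, and the ambiguity table is the standard IUPAC nucleotide code.

```python
from math import prod

# IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Complement of each IUPAC code.
IUPAC_COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W", "K": "M", "M": "K",
    "B": "V", "V": "B", "D": "H", "H": "D", "N": "N",
}


def degeneracy(primer):
    """Number of distinct unambiguous oligonucleotides the primer denotes."""
    return prod(len(IUPAC[base]) for base in primer)


def reverse_complement(primer):
    """Reverse complement, preserving ambiguity codes."""
    return "".join(IUPAC_COMPLEMENT[base] for base in reversed(primer))


def compatible(pattern, target):
    """True if every target base is allowed by the pattern at that position."""
    return len(pattern) == len(target) and all(
        base in IUPAC[code] for code, base in zip(pattern, target)
    )


FORWARD = "ACNACSGGSTGGCRGTGCGGCAC"    # SEQ ID NO:10
REVERSE = "ANGNGCCGCCGGAGTCNCC"        # SEQ ID NO:11
FRAGMENT_3_END = "GGCGACTCCGGCGGCTCAT"  # last 19 nt of the fragment's upper strand

print("forward degeneracy:", degeneracy(FORWARD))
print("reverse degeneracy:", degeneracy(REVERSE))
print("reverse primer site:", reverse_complement(REVERSE))
print("compatible with fragment end:",
      compatible(reverse_complement(REVERSE), FRAGMENT_3_END))
```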

Analysis of the Sequence of Cellulomonas sp. 69B4 Protease 

A saturated sinapinic acid (3,5-dimethoxy-4-hydroxy cinnamic acid; "SA") solution in 
a 1:1 v/v acetonitrile ("ACN")/0.1% formic acid solution was prepared. The resulting mixture 
was vortexed for 60 seconds and then centrifuged for 20 seconds at 14,000 rpm. Then, 5 µl 
of the matrix supernatant was transferred to a 0.5 ml Eppendorf tube and 1 µl of a 10 
pmole/µl protease 69B4 sample was added to the SA matrix supernatant and vortexed for 5 
seconds. Then, 1 µl of the analyte/matrix solution was transferred onto a sample plate and, 
after drying completely, analyzed by a Voyager DE-STR (PerSeptive) matrix-assisted 
laser desorption/ionization - time of flight (MALDI-TOF) mass spectrometer, with the 
following settings: Mode of operation: Linear; Extraction mode: Delayed; Polarity: Positive; 
Accelerating voltage: 25000 V; Extraction delay time: 350 nsec; Acquisition mass range: 
4000-20000 Da; Number of laser shots: 100/spectrum; and Laser intensity: 2351. The 
resulting spectrum is provided in Figure 4. 

A tryptic map was produced using methods known in the art (Christianson et al., 
Anal. Biochem. 223:119-29 [1994]), modified as described herein. The protease solution, 
containing 10-50 µg protease, was diluted 1:1 with chilled water in a 1.5 ml microtube. 
Then, 1.0 N HCl was added to a final concentration of 0.1 N HCl, mixed thoroughly and 
incubated for 10 minutes on ice. Then, 50% trichloroacetic acid ("TCA") was added to a 
final concentration of 10% TCA and mixed. The sample was incubated for 10 minutes on ice, 
centrifuged for two minutes and the supernatant discarded. Then, 1 ml of cold 90% acetone 
was added to resuspend the pellet. The resulting sample was then centrifuged for one 
minute, the supernatant quickly decanted and the remaining liquid removed by vacuum 
aspiration. The dry pellet was dissolved in 12 µl of 8.0 M urea solution (480 mg urea 
[Roche, catalog # 1685899] in 0.65 ml of ammonium bicarbonate solution; final 
concentration of bicarbonate: 0.5 M) and incubated for 3-5 minutes at 37°C. The solution 
was slowly diluted with 48 µl of an n-octyl-beta-D-glucopyranoside solution ("o-water") (200 
mg of n-octyl-beta-D-glucopyranoside [C14H28O6, f.w. 292.4] in 200 ml of water). Then, 2.0 
µl of trypsin (2.5 mg/ml in 1 mM HCl) was added and the mixture was incubated for 15 
minutes at 37°C. The proteolytic reaction was quenched with 6 µl of 10% trifluoroacetic acid 
("TFA"). Insoluble material and bubbles were removed from the sample by centrifugation for 
one minute. The tryptic digest was separated by RP-HPLC on a 2.1 x 150 mm C-18 column 
(5 µm particle size, 300 angstroms pore size). The elution gradient was formed from 0.1% 
(v/v) TFA in water and 0.08% (v/v) TFA in acetonitrile at a flow rate of 0.2 ml/min. The 
column compartment was heated to 50°C. Peptide elution was monitored at 215 nm and 
data were collected at 215 nm and 280 nm. The samples were then analyzed on an LCQ 
Advantage mass spectrometer with a Surveyor HPLC (both from Thermo Finnigan). The 
LCQ mass spectrometer was run with the following settings: Spray voltage: 4.5 kV; 
Capillary temperature: 225°C. Data processing was performed using TurboSEQUEST and 
Xcalibur (ThermoFinnigan). Sequencing of the tryptic digest portions was also performed in 
part by Argo BioAnalytica. 
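
The concentrations and volumes in the digestion protocol above can be checked with a few lines of arithmetic. The sketch below is illustrative; it assumes a molar mass of 60.06 g/mol for urea and assumes that 480 mg of urea dissolved in 0.65 ml of buffer gives a final volume of about 1.0 ml (the source states only the mass and the buffer volume). The function names are hypothetical.

```python
UREA_MOLAR_MASS = 60.06  # g/mol (assumed value for urea)


def molarity(mass_mg, molar_mass, final_volume_ml):
    """Molar concentration from a mass in mg and a final volume in ml."""
    return (mass_mg / molar_mass) / final_volume_ml


def stock_to_add(sample_volume, stock_conc, final_conc):
    """Volume of stock to add so the mixture reaches final_conc.

    Solves stock_conc * x = final_conc * (sample_volume + x) for x.
    """
    return sample_volume * final_conc / (stock_conc - final_conc)


def dilute(conc, volume, added_volume):
    """Concentration after adding solvent to a solution."""
    return conc * volume / (volume + added_volume)


# Urea stock: 480 mg in an assumed final volume of 1.0 ml.
urea_stock = molarity(480, UREA_MOLAR_MASS, 1.0)

# 12 µl of 8.0 M urea diluted with 48 µl of o-water.
urea_after_o_water = dilute(8.0, 12.0, 48.0)

# 2.0 µl of trypsin at 2.5 mg/ml (µl x mg/ml = µg).
trypsin_ug = 2.0 * 2.5

# Volumes of stock needed per 100 µl of sample.
hcl_per_100 = stock_to_add(100.0, 1.0, 0.1)    # 1.0 N HCl to 0.1 N final
tca_per_100 = stock_to_add(100.0, 50.0, 10.0)  # 50% TCA to 10% final

print(f"urea stock: {urea_stock:.2f} M")
print(f"urea after o-water: {urea_after_o_water:.2f} M")
print(f"trypsin added: {trypsin_ug:.1f} µg")
print(f"1.0 N HCl per 100 µl: {hcl_per_100:.1f} µl")
print(f"50% TCA per 100 µl: {tca_per_100:.1f} µl")
```

With 10-50 µg of protease in the sample, the 5 µg of trypsin corresponds to an enzyme-to-substrate ratio between 1:2 and 1:10 by mass.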



Analysis of the full sequence of the asp gene revealed that it encodes a 
prosequence protease of 495 amino acids (SEQ ID NO:6). The first 28 amino acids were 
predicted to form a signal peptide. The mature chain of 69B4 protease as produced by 
Cellulomonas strain 69B4 has a molecular weight of 18764 (determined by MALDI-TOF). 
The sequence of the N-terminus of the mature chain was also determined by 
MALDI-TOF analysis and starts with the sequence FDVIGGNAYTIGGR (SEQ ID NO:17). It 
is believed that the 69B4 protease has a unique precursor structure with NH2- and 
COOH-terminal pro-sequences, as is known to occur with some other enzymes (e.g., T. aquaticus 
aqualysin I; See e.g., Lee et al., FEMS Microbiol. Lett., 1:69-74 [1994]; Sakamoto et al., 
Biosci. Biotechnol. Biochem., 59:1438-1443 [1995]; Sakamoto et al., Appl. Microbiol. 
Biotechnol., 45:94-101 [1996]; Kim et al., Biochem. Biophys. Res. Commun., 231:535-539 
[1997]; and Oledzka et al., Protein Expr. Purific., 29:223-229 [2003]). The predicted 
molecular weight of mature 69B4 protease as provided in SEQ ID NO:8 was 18776.42, 
which corresponds well with the molecular weight of the purified enzyme with proteolytic 
activity isolated from Cellulomonas sp. 69B4 (i.e., 18764). The prediction of the 
COOH-terminal pro-sequence in 69B4 protease was also based on an alignment of the 69B4 
protease with T. aquaticus aqualysin I, provided below. In this alignment, the amino acid 
sequence of the Cellulomonas 69B4 signal sequence and precursor protease is aligned 
with the signal sequence and precursor protease aqualysin I of Thermus aquaticus (the 
COOH-terminal pro-sequence of aqualysin I is underlined and in bold). 

Aqualysin I (1) MRKTWWIALFAVLVLGGCQMASRSDPTPTIiAEAFWPKEAPVYG^ 

69B4 (1) MTPRTVTRALAVATAAATLLAGGMAAQANEPAPPGSASAPPRLAEKLDPD 

Consensus (1) MA A LLA6 A DP P A A PK A D 

51 100 

Aqualysin I (47) DPEAI PGRYZ WFKKGKGQS LLQGG I TTLQARLAPQGVWTQAYTGALQG 

69B4 (51) LLE AMERDLGLDAEEAAATLAFQHDAAETGEALAEE LDEDF AQTWVE 

20 Consensus (51) EAI L A A Q LA LFG 

101 150 
Aqualysin I (97) FAAEMAPQALEAFRQ S PDVEFI EADKWRAWATQ S PAFWGLDRI DQRDL P 
69B4 (98) DDVLYYATTDEDAVEEVEGEGATAVTVEH S LADLEAWKTVLDAALEGHDD 
Consensus (101) E DEAVAA LD 

25 151 200 

Aqualysin I (147) L SNS YTYTATGRGVNVYVIDTG IRTTHREFGGRARVG YDALGGNGQDCNG 
69B4 (148) VPTWYVDVPTNS - - WVAVKAG AQDVAAGL VEGADVP SDAVT — FVETDE 
Consensus (151) L YT VIG AV DAL D 

201 250 
30 Aqualysin I (197) HGTHVAGTIGGVTYGVAKAVNL YAVRVLDCNG SG STSGVI AGVDWVTRNH 

69B4 (194) TPRTMFDVIGGNAYTIGGRS RCSIGFAVNGGFITAQHCGRTG 

Consensus (201) M IGG Y IA C A G R 

251 300 
Aqualysin I (247) RRPAVANMSLGGGVSTALDNAVKN S IAAGWY A VAAGNDNANACNY S PAR 

69B4 (236) ATTANPTGTFAGSSFPGNDYAFVRTGAG VNLLAQVNNYSGGR 

Consensus (251) A SAG ADA S AA NAN NYS AR 

301 350 
Aqualysin I (297) VAEALTVGATTS SDARASFSNYGSCVDLF APGAS I P SAWYTSDTATQTLN 
69B4 (278) VQVAGHTAAPVGSAVCRSGSTTGWHCGTIT — ALNSSVTYPEGTVRGLIR 
40 Consensus (301) VAAAS SSG ASYT I 

351 400 
Aqualysin I (347) GTSMATPHVAGVAALYLEQNPSATPASVASAILNGATTGRLSGIGSGSPN 
69B4 (326) TTVCAEPGDSGGSLLAGNQAQGVTSGGSGNaiTGGTTFF 
Consensus (351) T A P AG A L Q T A A G T A 

401 450 

Aqualysin I (397) RLLYSLLSSGSGSTAPCTSCSYYTGSLSQ PGDYNFQPNGTYYYSP-A 

69B4 (376) LRMITTDS -GSS PAPAPTSCTGYARTFTGTLAAGRAAAQPNGS YVQVNRS 
Consensus (401) L S S GS TSCS Y S SG G QPNGSY A 

451 500 
50 Aqualysin I (443) GTHRAWLRG P AGTDFDLYLWRWDGSRWLTVG S ST G PT SEE SL SY SGT AGY 
69B4 (425) GTHSVCLNGPSGADFDLYVQRWNGSSWVTVAQSTSPGSNETITYRGNAGY 
Consensus (451) GTH L GPAG DFDLYL RW GS WLTVA ST P S ESISY G AGY 
501 521 
Aqualysin I (493) YLWRIYAYSGSGMYEFWLQRP (SEQ ID NO:644) 

69B4 (475) YRYVVNAASGSGAYTMGLTLP (SEQ ID NO:645) 

Consensus (501) Y W I A SGSG Y L P (SEQ ID NO :646) 
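
The predicted molecular weight quoted above for the mature chain can be approximated from SEQ ID NO:8 by summing average residue masses and adding one water for the free termini. The sketch below is an illustration under stated assumptions: the residue masses are commonly tabulated average values, all cysteines are treated as reduced, and the source does not say which mass table or software produced the figure of 18776.42, so a difference of a few daltons is to be expected.

```python
# Average residue masses in daltons (commonly tabulated values; an assumption here).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # one water per chain for the free N- and C-termini

# Mature protease sequence, SEQ ID NO:8 (189 residues).
MATURE = (
    "FDVIGGNAYTIGGRSRCSIGFAVNGGFITAGHCGRTGATTANPTGTFAGS"
    "SFPGNDYAFVRTGAGVNLLAQVNNYSGGRVQVAGHTAAPVGSAVCRSGST"
    "TGWHCGTITALNSSVTYPEGTVRGLIRTTVCAEPGDSGGSLLAGNQAQGV"
    "TSGGSGNCRTGGTTFFQPVNPILQAYGLRMITTDSGSSP"
)


def average_mass(sequence):
    """Average molecular mass of an unmodified, reduced polypeptide chain."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence) + WATER


predicted = average_mass(MATURE)
print(f"calculated average mass: {predicted:.2f} Da")
print(f"difference from 18776.42: {predicted - 18776.42:+.2f} Da")
print(f"difference from MALDI-TOF value 18764: {predicted - 18764:+.2f} Da")
```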



The sequences of three internal peptides of the purified enzyme from Cellulomonas 
sp. 69B4 having proteolytic activity were determined by MALDI-TOF analysis. All three 
peptides were also identified in the translation product of the isolated asp gene, confirming 
the identification of the correct protease gene (See, SEQ ID NO:1, above). 

Percentage Identity Comparison Between Asp and Streptogrisin 

The deduced polypeptide product of the asp gene (mature chain) was used in 
homology analysis with other serine proteases using the BLAST program and settings as 
described in Example 3. The preliminary analyses showed identities of about 44-48% 
(See, Table 4-1, below). Together with analysis of the translated sequence, these results 
provided evidence that the asp gene encodes a protease having less than 50% sequence 
identity with the mature chains of Streptogrisin-like serine proteases. An alignment of Asp 
with Streptogrisin A, Streptogrisin B, Streptogrisin C and Streptogrisin D of Streptomyces 
griseus is provided below. In this alignment, the amino acid sequence of the Cellulomonas 
69B4 mature protease ("69B4 mature") is aligned with the mature protease amino acid 
sequences of Streptogrisin C ("Sg-StreptogrisinCmature"), Streptogrisin B 
("Sg-StreptogrisinBmature"), Streptogrisin A ("Sg-StreptogrisinAmature"), Streptogrisin D 
("Sg-StreptogrisinDmature") and consensus residues. 



                               1                                                50 
69B4 mature             (1)    FDVI GGNAYTI GGRSRCS I GFAVN GGF I TAGHCGRTGATT 
Sg-StreptogrisinCmature (1)    ADI RGGD AYYMNG SGRCSVGF S VTRGTQNGFAT AGH CGR VGTTTNG -- VN 
Sg-StreptogrisinBmature (1)    - - I SGGDAI Y S ST -GRCSLGFNVRSGSTYYFLTAGHCTDGATTWWANSAR 
Sg-StreptogrisinAmature (1)    - - I AGGEAITTGG- SRCSLGFNVSVNGVAHALTAGHCTNI SASWS 
Sg-StreptogrisinDmature (1)    - - IAGGDAIWGSG - SRCSLGFNWKGGEPYFLTAGHCTESVTSWSD-TQG 
Consensus               (1)    IAGGDAIY G SRCSLGFNV G YFLTAGHCT GTTW 

                               51                                              100 
Asp mature              (41)   ANPTGTF AG S S F PGND YAFVRTGAGVNL LAQ VNNY S GGR VQVAGHT AAP V 
Sg-StreptogrisinCmature (49)   QQAQGTFC^STFPGRDIAWVATNANOTPRPLVNGYGRGDVTVAGSTASW 
Sg-StreptogrisinBmature (48)   TTVLGTTSGSSFPNNDYGIVRYTNTTI PKDGTVGG QDITSAANATV 
Sg-StreptogrisinAmature (43)   1 GTRTGTSFPNNDYG I IRH SNPAAADGRVYLYNGS YQDITTAGNAFV 
Sg-StreptogrisinDmature (47)   GSEI GANEG SSFPENDYGIiVKYTSDTAH PSEVNLYDGSTQAI TQAGDATV 
Consensus               (51)   IGT GSSFP NDYGIVRYTA VN Y G Q IT AG A V 

                               101                                             150 
Asp mature              (91)   GSAVCRSGSTTGWHCGTITALNSSVTYPEG-TVRGLIRTTVCAEPGDSGG 
Sg-StreptogrisinCmature (99)   GASVCRSGSTTGWHOTIQQIOTSVTYPEG-TISGVTRTSVCAEPGDSGG 
Sg-StreptogrisinBmature (94)   GMAVTRRG STTGTH SG SVTALNATVNYGGGDWY GMI RTNVCAEPGDSGG 
Sg-StreptogrisinAmature (90)   GQAVQRSGSTTGLRSGSVTGIJ^TVNYGSSGIWGMIQTNVCAEPGDSGG 
Sg-StreptogrisinDmature (97)   GQAVTRSGSTTQVHDGEVTAI^ATVNYGNGDIVNGLIQTTVCAEPGDSGG 
Consensus               (101)  G AV RSGSTTG H GSVTALNATVNYG G IV GLIRTTVCAEPGDSGG 

                               151                                             200 
Asp mature              (140)  SLLAGNQAQGVTSGGSGNCRTGGTTFFQPVNPILQAYGLRMITTDSGSSP 
Sg-StreptogrisinCmature (148)  SYISGSQAQGVTSGGSGNCSSGGTTYFQPINPIiLQAYGLTLVTSGGGTPT 
Sg-StreptogrisinBmature (144)  PLYSGTRAIGLTSGGSGNCSSGGTTFFQPVTEALSAYGVSVY 
Sg-StreptogrisinAmature (140)  SLFAGSTALGLTSGGSGNCRTGGTTFYQPVTEALSAYGATVL 
Sg-StreptogrisinDmature (147)  ALFAGDTALGLTSGGSGDCSSGGTTFFQPVPEALAAYGAEIG 
Consensus               (151)  SLFAGS ALGLTSGGSGNCSSGGTTFFQPV EALSAYGLTVI 

                               201                                             250 
Asp mature              (190) 
Sg-StreptogrisinCmature (198)  DP PTT P PTDS PGGTWAVGT AY AAGATVT YGGATYRC LQ AHTAQPGWTPAD 
Sg-StreptogrisinBmature (186) 
Sg-StreptogrisinAmature (182) 
Sg-StreptogrisinDmature (189) 
Consensus               (201) 

                               251 
Asp mature              (190)           (SEQ ID NO:8) 
Sg-StreptogrisinCmature (248)  VPALWQRV (SEQ ID NO:639) 
Sg-StreptogrisinBmature (186)           (SEQ ID NO:640) 
Sg-StreptogrisinAmature (182)           (SEQ ID NO:641) 
Sg-StreptogrisinDmature (189)           (SEQ ID NO:642) 
Consensus               (251)           (SEQ ID NO:643) 



Table 4-1. Percentage Identity: Comparison between Cellulomonas sp. 69B4 Protease 
Encoded by asp and Other Serine Proteases (identity between the mature chains) 

Protease (source organism)                              Identity with Asp protease 
                                                        (Cellulomonas sp. isolate 69B4) 

Streptogrisin A (S. griseus)                            48% 
Streptogrisin B (S. griseus)                            45% 
Streptogrisin C (S. griseus)                            47% 
Streptogrisin D (S. griseus)                            46% 
Alpha lytic endopeptidase (Lysobacter enzymogenes)      44% 
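
Percentage identity figures of the kind reported in Table 4-1 are obtained by counting identical positions in a pairwise alignment. The sketch below illustrates the calculation on a short ungapped excerpt (31 residues around the GDSGG motif) taken from the Asp and Streptogrisin C rows of the alignment above. It is not a reproduction of the BLAST analysis: the excerpt covers only a highly conserved region, so its identity is far higher than the whole-chain value, and the convention used here (matches divided by the number of alignment columns, ignoring columns where both sequences have a gap) is an assumption.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns of a pairwise alignment.

    Columns in which both sequences carry a gap are ignored; a residue
    aligned to a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Ungapped excerpts spanning the GDSGG motif, read from the alignment above.
ASP_EXCERPT = "TTVCAEPGDSGGSLLAGNQAQGVTSGGSGNC"
SGC_EXCERPT = "TSVCAEPGDSGGSYISGSQAQGVTSGGSGNC"

identity = percent_identity(ASP_EXCERPT, SGC_EXCERPT)
print(f"identity over excerpt: {identity:.1f}%")
```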



Additional protease sequences were also investigated. In these analyses, 
proteases homologous in protein sequence to the mature domain of ASP were searched for 
using BLAST. Those identified were then aligned using the multiple sequence alignment 
program clustalW. The numbers on the top of the alignment below refer to the amino acid 
sequence of the mature ASP protease. The numbers at the side of the alignment are 
sequence identifiers, as described at the bottom of the alignment. 



20 Sequence 1 10 20 30 40 

ASP FDVI GGNAYT I GGR S RC S I GF AVN GGFITAGHCGRTGATTANPTG TF 

2 TPLI AGGEAITTGGSRC SLGFNV- SVNGVAHALTAGHCTNI SASWS IGTR 

3 — I AGGEAI YAAGGGRC SLGFNVKS S SGATYALTAGHCTEI ASTWYTNSGQTSL — LGTR 

4 NKLI QGGDAI YAS SWRC SLGFNVRTS SGAEYFLTAGHCTDGAGAWRASSGGTV IGQT 

25 5 NKL I QGGDAI YASSVTOC SLGFNVRTS SGAEYFLTAGHCTDGAGAWRASSGGTV IGQT 

6 TKL I QGGDAI YAS SWRC SLGFNVRS S SGVDYFLTAGHCTDGAGTWYSNSARTTA — IGST 

7 TKLI SGGDAI YS STGRC SLGFNVR SGS -TYYFLTAGHCTDGATTWWANSARTTV — LGTT 

8 VLGGGAI YGGGSRC SAAFNV- TKGGARYFVTAGHCTNI SANWS ASSGGSV VGVR 

9 QREVAGGDAI YGGGSRC SAAFNV- TKNGVR YFLTAGHCTNL S STWS STSGGTS IGVR 

30 10 KPFIAGGDAITGNGGRCSLGFNVTKG-GEPHFLTAGHCTEGISTWSDSSG — QV — I GEN 

11 KPFVAGGDAITGGGGRCSLGFNVTKG-GEPYFITAGHCTESISTWSDSSG — NV — I GEN 

12 TPLIAGGDAIWGSGSRCSLGFNVVKG-GEPYFLTAGHCTESVTSWSDTQGG-SE — IGAN 

13 KTFASGGDAI FGGGARC SLGFNVTAGDG SAAFLTRGHCGGGATMWSDAQGGQP I — ATVD 

1 4 ' KTFASGGDAI FGGGARC SLGFNVTAGDGS PAFLTAGHCGVAADQWSDAQGGQPI — ATVD 
35 15 

1 6 TTRLNGAEPILSTAGRCSAGFNVTDG-TSDFILTAGHCGPTGSVWFGDRPGDGQ — VGRT 

17 ATVQGGDVYYINRS SRC S I GFAVT TGFVSAGHCGGSGASATTSSGEAL GTF 

1 8 ADI RGGDAYYMNGSGRC SVGFSVTRG - TQNGF ATAGHCGRVGTTTNGVNQQAQ GTF 

1 9 YDLRGGEAYYINNS SRC S I GFPITKG - TQQGFATAGHCGRAGSSTTGANRVAQ GTF 

40 20 YDLVGGDAYYIGN- GRC S IGFSVRQG - STPGFVTAGHCGSVGNATTGFNRVSQ GTF 

2 1 YDLVGGDAYYMGG-GRCSVGFSVTQG-STPGFATAGHCGTVGTSTTGYNQAAQ GTF 

22 EDLVGGDAYYIDDQARC S I GF SVTKD - DQEGFATAGHCGDPGATTTGYNEADQ GTF 

2 3 LAAIIGGNPYYFGNYRCSIGFSVRQG- SQTGFATAGHCGSTGTRVS S PSG TV 

2 4 ANIVGGIEYSINNASLC SVGFSVTRG- ATKGFVTAGHCGTVNATARIGGAW GTF 

45 25 AAGTVGGDPYYTGNVRC S I GFSVH GGFVTAGHCGRAG AGVSGWDRS YI GTF 

2 6 VIVFVRDYWGGDALSGCTLAFPVYGG FLTAGHCAVEGKGHILKTEMTGGQ- IGTV 

2 7 DPPLRSGLAIYGTNVRCSSAFMAYSG- S S YYMMTAGHC AEDS SYWEVPTYSYGYQGVGHV 
50 60 70 80 90 100 

ASP AG S S F PGN - D YAFVRTG AGVNLLAQVNNYSGGR - VQVAGHTAAPVG S AVCRSG STTGWHC 

2 TGT SF PNNDYGI I RH SNPAAA — DGRVYLYNG S YQDI TTAGNAFVGQAVQRSGSTTGLRS 

3 AGTSFPGNDYGLIRHSNASAA- -DGRVYLYNGSYRDITGAGNAWGQTVQRSGSTTGLHS 
5 4 AG S S F PGNDYG I VQYTG S VSRPGTANGVDITRAATPSVGTTVIRDGSTTGTHS 

5 AGSSF PGNDYG I VQYTG S VSR PGTANGVD I TRAAT P SVGTTVI RDG STTGTH S 

6 AGSSF PGNDYG IVRYTG S VSRPGTANGVD I TRAAT P SVGTTVI RDG STTGTH S 

7 SGSSFPNNDYGI VRYTNTT 1 PKDGTVGGQDITS AANATVGMAVTRRGSTTGTH S 

8 EGTS F PTND YGI VRYTDG S S P - - AGTVDLYNG STQD I S S AANAWGQAI KKSG STTKVTS 
10 9 EGT SF PTND YGI VRYTTTTNV — DGRVNLYNGGYQDI ASAADAWGQAIKKSGSTTKVTS 

10 AAS SFPGDDYGLVKYTADVAH — PSQVNLYDGSSQS I SGAAEAAVGMQVTRSGSTTQVHS 

11 AAS SFPDNDYGLVKYTADVDH — PSEVNLYNGS SQAI SGAAEATVGMQVTRSGSTTQVHD 

12 EGSSFPENDYGLVKYTSDTAH- - P S EVNL YDG S TQ AI TQ AGD ATVGQ AVTRSG STTQVHD 

13 QAVFPPEGDFGLVRYDGPSTE- -APSEVDLGDQTLPISGAAEASVGQEVFRMGSTTGLAD 
15 14 QAVFPGEGDFALVRYDDPATE — APSEVDLGDQTLPI SGAAEAAVGQEVFRMGSTTGLAD 

1 6 VAGSFPGDDFSLVEYANGKAGDGADWAVGDGKGVRITGAGEPAVGQRVFRSGSTSGLRD 

17 SGSWPGSADMAYVRTVSGTVLRGYINGYGQGS-FPVSGSSEAAVGASICRSGSTTQVHC 

18 QG STF PGR - DI AWVATNANWTPRPLVNGYGRGD - VTVAG STASWGASVCRSGSTTGWHC 
20 19 QG S I F PGR - DMAWVATNS SWTATPYVLGAGGQN- VQVTGSTAS PVGASVCRSGSTTGWHC 

2 0 RGSWF PGR- DMAWVAVNSNWTPTSLVRNSGSG- - VRVTG STQ ATVG S S I C RSG STTGWRC 

2 1 EESSFPGD-DMAWVSWSDWNTTPTVNEGE VTVSGSTEAAVGASICRSGSTTGWHC 

2 2 QASTF PGK - DMAWVGVN S DWTAT PDVKAEGGEK - 1 QLAG SVEALVGASVCRSGSTTGWHC 

23 AG SYF PGR - DMG WRITS ADTVTPLVNRYNGGT- VTVTG SQEAATGS SVCRSGATTGWRC 

25 24 AARVF PGN - DRAWVSLTS AQTLL PRVANG SSF — VTVRGSTEAAVGAAVCRSGRTTGYQC 

2 5 QGSSFPDN-DYAWSVGSGWWWPVVLGWGWSDQLWGSWAPVGASICRSGSTTHWHC 

2 6 EASQFGDG IDAAWAKNYGDWNGRGRVTHWNGGGGVDI KG SNEAAVGAHMCKSGRTTKWTC 

2 7 ADYTFGYYGDSAIVRVDDPGF WQPRGWVYPSTRITNWDYDWGQYVCKQGSTTGYTC 

30 110 120 130 140 150 

ASP GTITALNSSVTYPEGW-RGLIRTTVCAEPGDSGGSLLAGN-QAQGVTSGGS 

2 GSVTGLNATVNYGSSGIVYGMIQTNVCAEPGDSGGSLF-AGSTALGLTSGGS 

3 GRVTGLNATVNYGGGDIVSGL IQTNVCAE PGDSGGALF - AGSTALGLTSGGS 

4 GRVTALNATVNYGGGD WGGL I QTTVC AE PGD S GG SL YG SNGTAYGLTSGG S 

35 5 GRVTALNATVNYGGGDWGGLIQTTVCAEPGDSGGSLYGSNGTAYGLTSGGS 

6 GRVTALNATVNYGGGDIVSGLI QTTVC AEPGDSGG PL YG SNGTAYGLTSGG S 

7 G S VT ALN ATVNYGGGDWYGMI RTNVC AE PGD S GG PLY - SGTRAIGLTSGGS 

8 GTVTAVNVTVNYGDGP- VYNMGRTTAC S AGGDSGGAHF- AGSVALG I H SGS S 

9 GTVSAVNVTVNYSDGP - VYGMVRTTAC SAGGDSGGAHF - AGSVALG IHSGS S 

40 10 GTVTGLDATVNY GNGD I VNGLI QTDVC AEPGDSGG SLF SGDK- AVGLTSGGS 

1 1 GTVTGLDATVNY GNGDI VNGL I QTDVC AE PGD SGG SLF SGDQ- AIGLTSGGS 

12 GEVTALD ATVNY GNGDI VNGL I QTTVC AE PGD SGG ALF AGDT - ALGLTSGGS 

1 3 GQVLGLDVTVNY PEG - TVTGL I QTDVC AEPGD SGG SLFTRDGLAI RLTSGGT 

14 GQVLGLDATVNYPEG-MVTGLIQTDVCAEPGDSGGSLFTRDGLAIGLTSGGS 

45 15 VDGLIQTDVCAEPGDSGGALFDGDA- AIGLTSGGS-— 

16 GRVTALDATVNYPEG-TVTGLIETDVCAEPGDSGGPMFSEGV-ALGVTSGGS- 

17 GTI GAKGATVNYPQGAV- SGLTRTSVCAEPGDSGGSFYSGS-QAQGVTSGGS- 

18 GTIQQLNTSVTYPEGTI - SGVTRTSVCAEPGDSGGS YI SGS - QAQGVTSGGS 

19 GTVTQLNTSVTYQEGTI-SPVTRTTVCAEPGD SGG SF I SGS -QAQGVTSGGS 

50 20 GTIQQHNTSVTYPQGTI -TGVTRTSACAQPGDSGGSFI SGT- QAQGVTSGGS 

21 GTIQQHNTSVTYPEGTI-TGVTRTSVCAEPGDSGGSYI SGS -QAQGVTSGGS 

2 2 GTI QQHDTSVTYPEGTV- DGLTETTVCAEPGDSGGPFVSGV- QAQGTTSGGS 

23 GTI QSKNQTVRYAEGTV- TGLTRTTACAEGGDSGG PWLTGS - QAQGVTSGGT 

2 4 GTI TAKNVTANYAEGAV- RGLTQGNACMGRGDSGG SWITS AGQAQGVMSGGNVQSNGNNC 

55 25 GTVLAHNETVNYSDGSVVHQLTKTSVCAEGGDSGGSFISGD-QAQGVTSGGW 

2 6 GYLLRKDVSVNYGNGH I -VTLNET SAC ALGGD SGG AYVWND- QAQG ITSGSN 

2 7 GQITETNATVSYPGRTL-TGMTWSTACDAPGDSGSGVYDGSTAHGILSGGPN 

160 170 180 189 



60 ASP GNCRTGGTTFFQPVNPILQAYGLRMITTDSGSSP (SEQ ID NO: 18) 

2 GNCRTGGTTFYQPVTEALSAYGATVL (SEQ ID NO: 19) 

3 GNCRTGGTT (SEQ ID NO: 20) 

4 GNCSSGGTTFFQPVTEALSAYGVSVY - (SEQ ID NO: 21) 

5 GNCSSGGTTFFQPVTEALSAYGVSVY (SEQ ID NO: 22) 

65 6 GNCSSGGTTFFQPVTEALSAYGVSVY (SEQ ID NO: 23) 

7 GNCSSGGTTFFQPVTEALSAYGVSVY (SEQ ID NO: 24) 

8 GCSGTAGSAIHQPVTKALSAYGVTVYL (SEQ ID NO: 25) 



9 GCTGTNGS AI HQ PVREAL S AYGVNVY (SEQ ID NO: 26) 

10 GDCTSGGTTFFQPVTEALSATGTQIG (SEQ ID NO: 27) 

11 GDCTSGGETFFQPVTEALSATGTQIG (SEQ ID NO: 28) 

12 GDC S SGGTTF FQPVPEALAAYGAE I G (SEQ ID NO: 29) 

13 RDCTSGGETFFQPVTTALAAVGGTLGGEDGGDG- (SEQ ID NO: 30) 

14 GDCTVGGETFFQPVTTALAAVGATLGGEDGGAGA (SEQ ID NO: 31) 

15 GDCSQGGETFFQPVTEALKAYGAQIGGGQGEPPE (SEQ ID NO: 32) 

16 GDCAKGGTTFFQPLPEAMASLGVRLIVPGREGAA (SEQ ID NO: 33) 

17 GDCSRGGTTYFQPVNRILQTYGLTLVTA (SEQ ID NO: 34) 

18 GNCSSGGTTYFQPINPLLQAYGLTLVTSGG — GT (SEQ ID NO: 35) 

19 GDCRTGGETFFQPINALLQNYGLTLKTTGGDDGG (SEQ ID NO: 36) 

20 GNCS I GGTTFHQPVNP I LSQYGLTLVRS (SEQ ID NO: 37) 

21 GNCTSGGTTYHQPINPLLSAYGLDLVTG (SEQ ID NO: 38) 

22 GDCTNGGTTFYQPVNPLLSDFGLTLKTTSA (SEQ ID NO: 39) 

23 GDCRSGGITFFQPINPLLSYFGLQLVTG (SEQ ID NO: 40) 

24 GIPASQRSSLFERLQPILSQYGLSLVTG (SEQ ID NO: 41) 

25 GNC S SGGETWFQPVNE I LNRYGLTLHTA (SEQ ID NO: 42) 

26 -^1DTNNCRSFYQPVNTVIJNKWKLSLVTSTDVTTS (SEQ ID NO: 43) 

27 SGCGMIHEPI SRALADRGVTLLAG " (SEQ ID NO: 44) 



In the above listing, the numbers correspond as follows: 

1    ASP Protease 
2    Streptogrisin A (Streptomyces griseus) 
3    Glutamyl endopeptidase (Streptomyces fradiae) 
4    Streptogrisin B (Streptomyces lividans) 
5    SAM-P20 (Streptomyces coelicolor) 
6    SAM-P20 (Streptomyces albogriseolus) 
7    Streptogrisin B (Streptomyces griseus) 
8    Glutamyl endopeptidase II (Streptomyces griseus) 
9    Glutamyl endopeptidase II (Streptomyces fradiae) 
10   Streptogrisin D (Streptomyces albogriseolus) 
11   Streptogrisin D (Streptomyces coelicolor) 
12   Streptogrisin D (Streptomyces griseus) 
13   Subfamily S1E unassigned peptidase (SalO protein) (Streptomyces lividans) 
14   Subfamily S1E unassigned peptidase (SALO protein) (Streptomyces coelicolor) 
15   Streptogrisin D (Streptomyces platensis) 
16   Subfamily S1E unassigned peptidase (3SC5B7.10 protein) (Streptomyces coelicolor) 
17   CHY1 protease (Metarhizium anisopliae) 
18   Streptogrisin C (Streptomyces griseus) 
19   Streptogrisin C (SCD40A.16c protein) (Streptomyces coelicolor) 
20   Subfamily S1E unassigned peptidase (I) (Streptomyces sp.) 
21   Subfamily S1E unassigned peptidase (II) (Streptomyces sp.) 
22   Subfamily S1E unassigned peptidase (SCF43A.19 protein) (Streptomyces coelicolor) 
23   Subfamily S1E unassigned peptidase (Thermobifida fusca; basonym Thermomonospora fusca) 
24   Alpha-lytic endopeptidase (Lysobacter enzymogenes) 
25   Subfamily S1E unassigned peptidase (SC10G8.13C protein) (Streptomyces coelicolor) 
26   Yeast-lytic endopeptidase (Rarobacter faecitabidus) 
27   Subfamily S1E unassigned peptidase (SC10A5.18 protein) (Streptomyces coelicolor) 



EXAMPLE 5 

Screening for Novel Homologues of 69B4 Protease by PCR 

In this Example, methods used to screen for novel homologues of 69B4 protease are 
described. Bacterial strains of the suborder Micrococcineae, and in particular from the 
families Cellulomonadaceae and Promicromonosporaceae, were ordered from the German 
culture collection, DSMZ (Braunschweig), and received as freeze-dried cultures. Additional 
strains were received from the Belgian Coordinated Collections of Microorganisms, 
BCCM™/LMG (University of Ghent). The freeze-dried ampoules were opened according to 
DSMZ instructions and the material was rehydrated with sterile physiological saline (1.5 ml) 
for 1 h. Well-mixed, rehydrated cell suspensions (300 µL) were transferred to sterile 
Eppendorf tubes for subsequent PCR. 

PCR Methods 

(i) Pretreatment of the Samples 

The rehydrated microbial cell suspensions were placed in a boiling water bath for 10 
min. The suspensions were then centrifuged at 16000 rpm for 5 min. (Sigma 1-15 
centrifuge) to remove cell debris and remaining cells. The clear supernatant fraction served 
as template for the PCR reaction. 

(ii) PCR Test Conditions 

The DNA from these types of bacteria (Actinobacteria) is characteristically highly GC 
rich (typically >55 mol%), so addition of DMSO is necessary. The concentration chosen, 
based on earlier work with the Cellulomonas sp. strain 69B4, was 4% v/v DMSO. 
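
As a rough illustration of what "GC rich" means in practice, the following sketch computes the GC percentage of a short sequence. It uses the Prot-int_FW1 primer listed in this Example as input; the function name `gc_percent` is a hypothetical helper, not part of any method described here.

```python
def gc_percent(seq: str) -> float:
    """Return the percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


# Prot-int_FW1 (SEQ ID NO:45), used here only as sample input.
primer = "TGCGCCGAGCCCGGCGACTC"
print(f"{gc_percent(primer):.1f}% GC")  # 16 of 20 bases are G or C
```

A single primer is of course not representative of a whole genome; the sketch only shows how the mol% GC figure is derived.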

(iii) PCR Primers (chosen from the following pairs) 

Prot-int_FW1 5'-TGCGCCGAGCCCGGCGACTC-3' (SEQ ID NO:45) 

Prot-int_RV1 5'-GAGTCGCCGGGCTCGGCGCA-3' (SEQ ID NO:46) 

Prot-int_FW2 5'-TTCCCCGGCAACGACTACGCGTGGGT-3' (SEQ ID NO:47) 

Prot-int_RV2 5'-ACCCACGCGTAGTCGTTGCCGGGGAA-3' (SEQ ID NO:48) 

Cellu-FW1 5'-GCCGCTGCTCGATCGGGTTC-3' (SEQ ID NO:49) 

Cellu-RV1 5'-GCAGTTGCCGGAGCCGCCGGACGT-3' (SEQ ID NO:50) 
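
The Prot-int forward and reverse primers above are reverse complements of one another, i.e. they target the same internal site on opposite strands. A minimal sketch that checks this from the listed sequences (the helper `reverse_complement` is a hypothetical name introduced for illustration):

```python
# Primer sequences copied from the list above, written 5'->3'.
PRIMERS = {
    "Prot-int_FW1": "TGCGCCGAGCCCGGCGACTC",
    "Prot-int_RV1": "GAGTCGCCGGGCTCGGCGCA",
    "Prot-int_FW2": "TTCCCCGGCAACGACTACGCGTGGGT",
    "Prot-int_RV2": "ACCCACGCGTAGTCGTTGCCGGGGAA",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse, giving the opposite strand 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


for fw, rv in (("Prot-int_FW1", "Prot-int_RV1"),
               ("Prot-int_FW2", "Prot-int_RV2")):
    same_site = reverse_complement(PRIMERS[fw]) == PRIMERS[rv]
    print(fw, rv, same_site)
```

Because each FW/RV pair of the same number anneals to the same site, a productive amplification presumably combines primers from different sites (for example FW2 with RV1); the text does not state which combinations were used for each strain.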



(iv) PCR Mixture (all materials supplied by Invitrogen) 

Template DNA                    4 µL 
10x PCR buffer                  5 µL 
50 mM MgSO4                     2 µL 
10 mM dNTPs                     1 µL 
Primers (10 µM soln.)           1 µL each 
Platinum Taq HiFi polymerase    0.5 µL 
DMSO                            2 µL 
MilliQ water                    33.5 µL 
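
The volumes above can be checked against the stated 4% v/v DMSO. The sketch below assumes two primers per reaction (1 µL each) and simply sums the listed volumes; the dictionary keys are labels chosen for illustration.

```python
# Volumes in microlitres, taken from the PCR mixture listed above.
# Assumption: two primers (one forward, one reverse) at 1 µL each.
components_ul = {
    "template DNA": 4.0,
    "10x PCR buffer": 5.0,
    "50 mM MgSO4": 2.0,
    "10 mM dNTPs": 1.0,
    "forward primer (10 µM)": 1.0,
    "reverse primer (10 µM)": 1.0,
    "polymerase": 0.5,
    "DMSO": 2.0,
    "MilliQ water": 33.5,
}

total_ul = sum(components_ul.values())


def final_conc(stock: float, volume_ul: float) -> float:
    """Final concentration after dilution into the full reaction volume."""
    return stock * volume_ul / total_ul


dmso_percent = 100.0 * components_ul["DMSO"] / total_ul
print(f"total volume: {total_ul} µL")
print(f"DMSO: {dmso_percent}% v/v")
print(f"MgSO4: {final_conc(50.0, 2.0)} mM")
print(f"each dNTP: {final_conc(10.0, 1.0)} mM")
print(f"each primer: {final_conc(10.0, 1.0)} µM")
```

Under these assumptions the reaction volume is 50 µL and the DMSO fraction is 4% v/v, in agreement with the test conditions given in (ii).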

(v) PCR Protocol 

1) 94°C 5min 

2) 94°C 30 sec 

3) 55°C 30 sec 

4) 68°C 3 min 

5) Repeat steps 2-4 for 29 cycles 

6) 68°C 10 min 

7) 15°C 1 min 

The amplified PCR products were examined by agarose gel electrophoresis. Distinct 
bands for each organism were excised from the gel, purified using the Qiagen gel extraction 
kit, and sequenced by BaseClear, using the same primer combinations. 

(vi) Sequence Analysis 

Nucleotide sequence data were analyzed and the DNA sequences were translated 
into amino acid sequences to review the homology to 69B4-mature protein. Sequence 
alignments were performed using AlignX, a component of Vector NTI suite 9.0.0. The 
results are compiled in Table 5-1. The numbering is that used in SEQ ID NO:8. 



Table 5-1. Percent Identity of (translated) Amino Acid Sequences found 
in Natural Isolate Strains Compared to 69B4 Mature Protease 

Microorganism                             No. of Amino Acids   Overlap Position   % Identity 
Cellulomonas flavigena DSM 20109          101                  34-134             62 
Cellulomonas biazotea DSM 20112           114                  26-139             68 
Cellulomonas fimi DSM 20113               109                  32-140             72 
Cellulomonas gelida DSM 20111             48                   142-189            69 
Cellulomonas iranensis DSM 14785          85                   52-123             66 
Cellulomonas cellasea DSM 20118           102                  32-133             63 
Cellulomonas xylanilytica LMG 21723       143                  16-158             73 
Oerskovia turbata DSM 20577               111                  34-144             74 
Oerskovia jenensis DSM 46000              129                  22-150             70 
Cellulosimicrobium cellulans DSM 20424    134                  35-168             53 
Promicromonospora citrea DSM 43110        85                   52-136             75 
Promicromonospora sukumoe DSM 44121       85                   52-136             73 
Xylanibacterium ulmi LMG 21721            141                  16-156             64 
Streptomyces griseus ATCC 27001           No PCR product detected homologous to 69B4 protease 
Streptomyces griseus ATCC 10137           No PCR product detected homologous to 69B4 protease 
Streptomyces griseus ATCC 23345           No PCR product detected homologous to 69B4 protease 
Streptomyces fradiae ATCC 14544           No PCR product detected homologous to 69B4 protease 
Streptomyces coelicolor ATCC 10147        No PCR product detected homologous to 69B4 protease 
Streptomyces lividans TK23                No PCR product detected homologous to 69B4 protease 

These results show that PCR primers based on polynucleotide sequences of the 
69B4 protease gene (mature chain), SEQ ID NO:4, are successful in detecting homologous 
genes in bacterial strains of the suborder Micrococcineae, and in particular from the families 
Cellulomonadaceae and Promicromonosporaceae. 

Figure 2 provides a phylogenetic tree of ASP protease. The phylogeny of this protease 
was examined by a variety of approaches from mature sequences of similar members of the 
chymotrypsin superfamily of proteins and ASP homologues for which significant mature 
sequence has been deduced. Using protein distance methods known in the art (See e.g., 
Kimura, The Neutral Theory of Molecular Evolution, Cambridge University Press, 
Cambridge, UK [1983]), similar trees were obtained either including or excluding gaps. The 
phylogenetic tree of Figure 2 was constructed from aligned sequences (positions 16-181 of 
SEQ ID NO:8) using TREECONW v.1.3b (Van de Peer and De Wachter, Comput. Appl. 
Biosci., 10:569-570 [1994]), with tree topology inferred by the Neighbor-Joining 
algorithm (Saitou and Nei, Mol. Biol. Evol., 4:406-425 [1987]). This tree indicates that the 
ASP series of homologous proteases ("cellulomonadins") forms a separate subfamily of 
proteins. In Figure 2, the numbers provided in brackets correspond to the sequences 
provided herein. 
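
A protein distance of the kind referred to above can be sketched as follows. The correction shown, d = -ln(1 - p - 0.2p²), is Kimura's empirical formula for protein sequences, where p is the fraction of differing aligned residues. Whether the tree of Figure 2 used exactly this correction is not stated in the text, so this is an illustrative assumption; the function names are hypothetical.

```python
import math


def p_distance(aligned_a: str, aligned_b: str, ignore_gaps: bool = True) -> float:
    """Fraction of aligned columns that differ; gap columns optionally skipped."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = list(zip(aligned_a, aligned_b))
    if ignore_gaps:
        columns = [(x, y) for x, y in columns if x != "-" and y != "-"]
    if not columns:
        raise ValueError("no comparable columns")
    return sum(1 for x, y in columns if x != y) / len(columns)


def kimura_distance(p: float) -> float:
    """Kimura's corrected protein distance for an observed difference p."""
    arg = 1.0 - p - 0.2 * p * p
    if arg <= 0.0:
        raise ValueError("sequences too divergent for this correction")
    return -math.log(arg)


p = p_distance("AC-E", "ACDF")  # one difference over three gap-free columns
print(round(kimura_distance(p), 4))
```

A matrix of such pairwise distances is the input to a Neighbor-Joining tree builder; toggling `ignore_gaps` corresponds to the "including or excluding gaps" comparison mentioned above.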

The following is an alignment between the Cellulomonas 69B4 ASP protease and 
homologous proteases of related genera described herein. 

69B4 (ASP) complete 
Cellulomonas gelida 
Cellulomonas flavigena 
Cellulomonas biazotea 
Cellulomonas fimi 
Cellulomonas iranensis 
Cellulomonas cellasea 
C. xylanilytica 
Oerskovia turbata 
Oerskovia jenensis 
Cm. cellulans 
Pm. citrea 
Pm. sukumoe 
69B4 (ASP) mature 
Consensus 



69B4 ( ASP ) complete 
Cellulomonas gelida 
Cellulomonas flavigena 
Cellulomonas biazotea 
Cellulomonas fimi 
Cellulomonas iranensis 
Cellulomonas cellasea 
C. xylanilytica 
Oerskovia turbata 
Oerskovia jenensis 
Cm. cellulans 
Pm. citrea 
Pm . sukumoe 
69B4 (ASP) mature 
Consensus 



69B4 (ASP) complete 
Cellulomonas gelida 
Cellulomonas flavigena 
Cellulomonas biazotea 
Cellulomonas fimi 
Cellulomonas iranensis 
Cellulomonas cellasea 
C. xylanilytica 
Oerskovia turbata 
Oerskovia jenensis 
Cm. cellulans 
Pm. citrea 
Pm. sukumoe 
69B4 (ASP) mature 
Consensus 



69B4 (asp) complete 
Cellulomonas gelida 
Cellulomonas flavigena 
Cellulomonas biazotea 
Cellulomonas fimi 
Cellulomonas iranensis 
Cellulomonas cellasea 
C. xylanilytica 
Oerskovia turbata 
O.jenenensis revi 
Cm. cellulans 
Pm. citrea 
Pm. sukumoe 
69B4 (ASP) mature 
Consensus 



1 50 

(1) MT PRTVTRALAVATAAATLLAGGMAAQ ANE PAP PG S AS AP PRLAEKLDPD 
(1 ) - - 

(1) - -r 

(1) - 

(!) 

(!) 

(1) 

(!) _ 

( 1 ) MAR SFWRTLATACAATALVAG PAALTANAAT PT PDTPTVS PQTS SKVS PE 

(!) 

(1) 

(D w-w 

(!) 

(!) 

(1) 

51 100 

(51) LLEAMERDLGLDAEEAAATLAFQHDAAETGEALAE E LDEDF - AGTWVEDD 

(1, 

(1) 

(1) — " 

(!) 

(1) ~ 

(1) V 

(1) 

(51) VLRALQRDLGL S AKDATKRLAFQ SDAAS TEDALAD S LDAY AGAWVDP ARN 

(1) 

(!) PRAAGRAARSSGSRASAS 

(1) ; 

(1) 

(1) ~ — " 

(51) 

101 150 

(100) VL YVATTDEDAVE EVEGEG ATAVTVEH S LADLEAWKTVLDAAIjEGHDDVP 

(1) — 

(1) 

(1) KQTASEFVIRLTIGELNLAAANSPLPIGHAWSTAL 

(1) 

(1) — 

( 2 ) GRVRQLPLRGHD VLPARERDPAGLRSASRPGLTRSRRARLDAAGPSARVA 
(1) 

(101) TL YVGVADRAEAKEVRS AGAT PVVVlDHTLABlIjDTOKAAIilXaEL^ P AGVP 

(1) " " " 
(19) TSPGPTSVTASAS SCGRATGRRQRWTFEAIX3TVRAGGKCMDVAWAPRPTA 

(1) " 

(1, - 

(1) " - 

(101) 

151 200 

(150) TWYVDVPTN SVWAVKAGAQHVAAGL VT2GAD V? SDAVTFVETDET PRTMF 
(1) - — 

(1) - v 

(36) GWYVDVTTNTVWNATALAVAQATEIVAAAT^ 

(1) ' V 

(1) ~ 

( 52 ) AWYVDVPTNKL VVE SVG — DTAAAADAVAAAGLPADAVTLATTEAPRTFV 

(l) zz 

(151) SWFVDVTTNQWVNVHDGGRA1JU2LAAAS 

(1) — " 

(69) RRS S SRTARQRG PEVRAQRRGRPRVGAGEQS ASTP PGAHRGTRGAVRAHG 

(1) - 

(1) " " 

(1) - " " P 

(151) 



201 



250 
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69B4 (ASP) complete (200) 

Cellulomonas gelida (1) 

Cellulomonas flavigena (2) 

Cellulomonas biazotea (86) 

C. fimi. revi (2) 

C.iranensis revi (1) 

Cellulomonas cellasea (100) 

C. xylanilytica (1) 

Oerskovia turbata (201) 

Oerskovia jenensis (1) 

Cm. cellulans (119) 

Pm. citrea (1) 

Pm. sukumoe (1) 

69B4 (ASP) mature * (2) 

Consensus (201) 



DVT GGNA YTI GGRSR- 



-CSIGFAVNGGFITAGHCGRTGA- 



-TTA 



69B4 (ASP) complete (240) 

Cellulomonas gelida (1) 

Cellulomonas flavigena (42) 

Cellulomonas biazotea (126) 

Cellulomonas fimi (42) 

Cellulomonas iranensis (1) 

Cellulomonas cellasea (140) 

C. xylanilytica (27) 

Oerskovia turbata (241) 

Oerskovia jenensis (27) 

Cm. cellulans (169) 

Pm. citrea . (1) 

Pm. sukumoe (1) 

69B4 (ASP) mature (42) 

Consensus (251) 



DVIGGNAYYIGSRSR- 
DVTGGNRYRINNTSR- 
DVTGGDAYYIGGRSR- 



-CSIGFAVEGGFVTAGHCGRAGA STS 

-CSVX3FAVSGGFVTAGHCGTTGA TTT 

- C S I G FAVTGGF VTAGH CGRTGA ATT 



DVTGGNAYYINASSR CSVGFAVEGGFVTAGHCGRAGA STS 

r CSIGFAVTGGFVTAGHCGRSGA TTT 

DWGGNAYTMGSGGR CSVGFAVNGGFITAGHCGSVGT RTS 

r C SVGF A VNGGFVTAGH CGTVGT RTS 

DVRGGDRYITRDPGAS SGS ACSI GYAVQGGFVTAGHCGRGGTRRVLTASW 



DVI GGNA YTI GGRSR- 
DVTGG Y I R 



- CSIGFAVNGGFITAGHCGRTGA- 
CSIGFAV GGFVTAGHCGR GA 



-TTA 
TS 



251 300 
NPTGTFAGSSFPGNDYAFVRTGAGVNLLAQVNNYSGGRVQVAGHTAAPVG 

SPSGTFRGS SFPGNDYAWVQVASGNTPRGLVNNHSGGTVRVTGSQQAAVG 
KPSGTFAG S SF PGND YAWVR VASGNT P VGAVNNY SGGTVA VAG STQATVG 
S PSGTFAG S S F PGND YAWVR VAS GNT PVGAVNNYS GGTVA VAG STQ AAVG 

FPGNDYAWVQ VG SGDT PRGLVNNYAGGTVRVTG S QQ AAVG 

S PS GTFRGS S F PGNDYAWVQ VASGNT PRGLVNNH SGGTVRVTG S QQAAVG 
S PSGTFAG S S F PGND Y A WVRAAS GNT P VGAVNR YDG S RVTVAG STDAAVG 
G PGGTFRG SNF PGNDY AWVQ VDAGNTP VG A VNNY S GGRVAVAG ST AAPVG 
GPGGTFRGSSFPGNDYAWVQVDAGNTPVGAVNNYSGGRVAVAGSTAAPVG 
ARMGTVQ AAS F PGHD YAWVRVDAG F S PVPRVNNYAGGTVDVAG S AEAPVG 

F PGND YAWVNTGTDDTLVG AVNNY SGGTVNVAG STRAAVG ' 

F PGND YAWVNVG SDDT P I GA VNNYSGGTVNVAG STQAAVG 

NPTGTFAG S S FPGNDYAFVRTGAGVNIXAQVNNYS GGR VQ VAGHTAAPVG 
P GTF G S S F PGNDYAWVQ VAS GNT P VGAVNNY SGGTV VAG ST AAVG 



69B4 (ASP) complete 
Cellulomonas gelida 
Cellulomonas flavigena 
Cellulomonas biazotea 
Cellulomonas fimi 
Cellulomonas iranensis 
Cellulomonas cellasea 
C. xylanilytica 
Oerskovia turbata 
Oerskovia jenensis 
Cm. cellulans 
Pm. citrea 
Pm . sukumoe 
69B4 (ASP) mature 
Consensus 



69B4 (ASP) complete 
Cellulomonas gelida 
Cellulomonas flavigena 
Cellulomonas biazotea 
Cellulomonas fimi 
Cellulomonas iranensis 
Cellulomonas cellasea 
C. xylanilytica 
Oerskovia turbata 
Oerskovia jenensis 
Cm. cellulans 
Pm . ci trea 
Pm. sukumoe 
69B4 (ASP) mature 
Consensus 



301 350 

(290) SAVCRSGSTTGVraO^ITALNSSVTYPEGTVRGLIRTTVCAEPGDSGGSL 
(1) 

( 92 ) SWCRSGSTTGWRCGYVRAYim'VRYAEGSVSGLIRTSVCAEPGDSGGSL 
(176) ASVCRSGSTTGWRCX5TIQAFNSTVNYAQGSVSGLIRTNVCAEPGDSGGSL 

(92) ATVCRSGSTTGWRCGTI QAFNATVNYAEGSVSGLI RTNVCAEPGDSGGSL 

(41) AWCRSGSTTGTOCGTVQAYNASVRYAEGTVSGLIRTNVCAEPGD 

(190) SYVTRSGSTTGMlC^YVllAYNTTVKYAEGSVSGLIRTSVC^PGDSGGS^ 

(77) AAVCRSG STTAWGCGT I Q SRGAS VTY AQGTVS GLIRTTCVCAE PGDSGGSL 

(291) AS VCRSGSTTGVfflCGTIGAYNTSVTYPQGTVSGLIRTNVCAE PGDSGGSL 
(77) S SVCRSGSTTGWRCGTI AAYNS SVTYPQGTVSGLI RTNVCAEPGDSGGSL 

(219) ASVCRSGATTGWRCGVT EQKNITVNYGNGDVPGLVRG S ACAEGGDSGGSV 

(41) ATVCRSGSTTGWHCGTI QALNASVTYAEGTVSGLIRTNVCAEPGD 

(41) STVCRSG STTGWHCGTI Q AFNASVTYAEGTVSGL I RTNVCAEPGD 

(92) SAVC^SGSTTGWHCGTITALNSSVTYPEGTVllGLIRTTVCAEPGDSGGSL 

(301) ASVCRSGSTTGWRCGTI AYNASV YAEGTVSGLIRTNVCAEPGDSGGSL 

351 400 

(340) LAC^QAQGWSGGSGNC^TGGTTFFQPVNPILQAYGIJIMITT-DSGSSPA 
(1) LAGNQ AQGVT SGG S GNCS SGGTT YFQPVNEALRVYGLTLVT S - DGGGTE - 

( 142 ) VAGTQAQGVTSGGSGNCRYGGTTYFQPVNEILQDQPGPSTTR-AL 

(226) IAG^QAQGLTSGGSGNCTTGGTTYFQPVNEALSAYGLTLVTSSGGGGGGG 

(142) VAG 

(86) 

(240) VAGTQAQGVTSGGSGNOIYGGTTYFQPVNE I LQAYGLRLVLG - HARGGPS 

(127) IAGTQARGVTSGGSGNC 

(341) LAGNQAQGVTSGG SGNCSSGGTTYFQPVNEALGG YGLTLVTSDGGG PSRR 
(127) LAGNQ AQGLTSGGSGNCSSGGTTYFQPVNEALSAYGLTLVTSGGRGNC — 
(269) ISGNQAQGVTSGRINDCSNGGKFLYQPDRRPVARDHGRRVGQRARRARGQ 

(86) ---- ~ 

(86) 

(142) LAGNQAQGVTSGGSGNCRTGGTTFFQPVNPILQAYGIjRMITTDSGSSP-- 
(351) LAGNQAQGVTSGGSGNC GGTTYFQPVN L YGL LV 



69B4 (ASP) complete (389) -PAPTSCTGYARTFTGTLAAGRAAAQPNGSYVQVNRSGTO 

Cellulomonas gelida (49) - PPPTGCQGYARTYQGSVSAGTSVAQPNGS YVTTG -GGTHRVCLSGPAOT 

Cellulomonas flavigena (186) 

Cellulomonas biazotea (276) TTCTGYARTYTGSIJ^RQSAVQPSGSYVTVGSSGTIRVCLDGPSGT 

Cellulomonas fimi (145) 

Cellulomonas iranensis (86) 

Cellulomonas cellasea (289) -PARRAPAPPARA 

C xylanilytica (144) - — — 
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Oerskovia turbata 
Oerskovia jenensis 
Cm. cellulans 
Pm. citrea 
Pm. sukumoe 
69B4 (ASP) mature 
Consensus 



69B4 (ASP) complete 
Cellulomonas gelida 
Cellulomonas flavigena 
Cellulomonas biazotea 
Cellulomonas fimi 
Cellulomonas iranensis 
Cellulomonas cellasea 
C. xylanilytica 
Oerskovia turbata 
Oerskovia jenensis 
Cm. cellulans 
Pm. citrea 
Pm. sukumoe 
69B4 (ASP) mature 
Consensus 



69B4 (ASP) complete 
Cellulomonas gelida 
Cellulomonas flavigena 
Cellulomonas biazotea 
Cellulomonas fimi 
Cellulomonas iranensis 
Cellulomonas cellasea 
C. xylanilytica 
Oerskovia turbata 
Oerskovia jenensis 
Cm. cellulans 
Pm. citrea 
Pm. sukumoe 
69B4 (ASP) mature 
Consensus 



RPGARAMRGPTRAASRPGRRSRSERFVRHDRGRATGCA- 



(391) 

(175) 

(319) VHRRPRVRLQ- 

(86) 

(86) 

(190) 

(401) 



451 500 
DFDLWQRWNGSSWVTVAQSTSPGSNETITYRGNAGYYRYVVNAASGSGA 
DLDLYLQKWNG YSWAS VAQSTS PGATEA VTYTGTAG YYRYWHAYAGSGA 



DFDLYLQKWNGSAW- 



(438) 
(97) 
(186) 
(322) 
(145) 
(86) 
(301) 
(144) 
(429) 
(175) 
(329) 
(86) 
(86) 
(190) 
(451) 



501 

(488) YTMGLTLP 
(147) YTLGATTP 
(186) 



(336) 
(145) 
(86) 
(301) 
(144) 
(429) 
(175) 
(329) 
(86) 
(86) 
(190) 
(501) 



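69B4 (ASP) complete (SEQ ID NO:6) 
Cellulomonas gelida (SEQ ID NO:60) 
Cellulomonas flavigena (SEQ ID NO:54) 
Cellulomonas biazotea (SEQ ID NO:56) 
Cellulomonas fimi (SEQ ID NO:58) 
Cellulomonas iranensis (SEQ ID NO:62) 
Cellulomonas cellasea (SEQ ID NO:64) 
C. xylanilytica (SEQ ID NO:66) 
Oerskovia turbata (SEQ ID NO:68) 
Oerskovia jenensis (SEQ ID NO:70) 
Cm. cellulans (SEQ ID NO:72) 
Pm. citrea (SEQ ID NO:74) 
Pm. sukumoe (SEQ ID NO:76) 
69B4 (ASP) mature (SEQ ID NO:8) 
Consensus (SEQ ID NO:647) 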






EXAMPLE 6 

Detection of Novel Homologues of 69B4 Protease by Immunoblotting 

In this Example, immunoblotting experiments used to detect homologues of 69B4 
are described. The following organisms were used in these experiments: 

1. Cellulomonas biazotea DSM 20112 

2. Cellulomonas flavigena DSM 20109 

3. Cellulomonas fimi DSM 20113 

4. Cellulomonas cellasea DSM 20118 

5. Cellulomonas uda DSM 20107 

6. Cellulomonas gelida DSM 20111 

7. Cellulomonas xylanilytica LMG 21723 

8. Cellulomonas iranensis DSM 14785 

9. Oerskovia jenensis DSM 46000 

10. Oerskovia turbata DSM 20577 

11. Cellulosimicrobium cellulans DSM 20424 

12. Xylanibacterium ulmi LMG 21721 

13. Isoptericola variabilis DSM 10177 

14. Xylanimicrobium pachnodae DSM 12657 

15. Promicromonospora citrea DSM 43110 

16. Promicromonospora sukumoe DSM 44121 

17. Agromyces ramosus DSM 43045 

The strains were first grown on Heart Infusion/skim milk agar plates (72 h, 30°C) to 
confirm strain purity, to check protease reaction by clearing of the skim milk, and to serve as 
inoculum. Bacterial strains were cultivated on Brain Heart Infusion broth supplemented with 
casein (0.8% w/v) in 100/500 Erlenmeyer flasks with baffles at 230 rpm, 30°C for 5 days. 
Microbial growth was checked by microscopy. Supernatants were separated from cells by 
centrifugation for 30 min at 4766 x g. Further solids were removed by centrifugation at 9500 
rpm. Supernatants were concentrated using Vivaspin 20 ml concentrator (Vivascience), 
cutoff 10 kDa, by centrifugation at 4000 x g. Concentrates were stored in aliquots of 0.5 mL 
at -20°C. 

Primary antibody 

The primary antibody (EP034323) for the immunoblotting reaction, prepared by 
Eurogentec (Liege Science Park, Seraing, Belgium), was raised against two peptides 
consisting of amino acids 151-164 and 178-189 in the 69B4 mature protease (SEQ ID 
NO:8), namely TSGGSGNCRTGGTT (epitope 1; SEQ ID NO:51) and LRMITTDSGSSP 
(epitope 2; SEQ ID NO:52), as shown below in the amino acid sequence of 69B4 mature 
protease: 

1 FDVTGGNAYT IGGRSRCSIG FAVNGGFITA GHCGRTGATT ANPTGTFAGS 
51 SFPGNDYAFV RTGAGVNLLA QVNNYSGGRV QVAGHTAAPV GSAVCRSGST 
101 TGWHCGTITA LNSSVTYPEG TVRGLIRTTV CAEPGDSGGS LLAGNQAQGV 
151 TSGGSGNCRT GGTTFFQPVN PILQAYGLRM ITTDSGSSP (SEQ ID NO: 8) 
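
The stated epitope positions can be checked against the mature sequence with a short sketch. The sequence string below is transcribed from the listing above; its final row (residues 151-189) is an assumption, reconstructed from the epitope definitions and the ASP row of the earlier alignment because that row is damaged in the source.

```python
# 69B4 mature protease (SEQ ID NO:8), 189 residues. The last row is a
# reconstruction (assumption), see the accompanying text.
MATURE_69B4 = (
    "FDVTGGNAYTIGGRSRCSIGFAVNGGFITAGHCGRTGATTANPTGTFAGS"
    "SFPGNDYAFVRTGAGVNLLAQVNNYSGGRVQVAGHTAAPVGSAVCRSGST"
    "TGWHCGTITALNSSVTYPEGTVRGLIRTTVCAEPGDSGGSLLAGNQAQGV"
    "TSGGSGNCRTGGTTFFQPVNPILQAYGLRMITTDSGSSP"
)

EPITOPES = {
    "epitope 1": "TSGGSGNCRTGGTT",  # SEQ ID NO:51
    "epitope 2": "LRMITTDSGSSP",    # SEQ ID NO:52
}


def locate(peptide: str, protein: str) -> tuple:
    """Return the 1-based (start, end) of the first occurrence of peptide."""
    index = protein.find(peptide)
    if index < 0:
        raise ValueError("peptide not found")
    return index + 1, index + len(peptide)


for name, peptide in EPITOPES.items():
    print(name, locate(peptide, MATURE_69B4))
```

With this transcription the two peptides fall at residues 151-164 and 178-189, matching the positions given in the text.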



Electrophoresis and Immunoblotting 
Sample preparation 

1. Concentrated culture supernatant (50 µL) 

2. PMSF (1 µL; 20 mg/ml) 

3. 1 M HCl (25 µL) 

4. NuPAGE LDS sample buffer (25 µL) (Invitrogen, Carlsbad, CA, USA) 

The components were mixed and heated at 90°C for 10 min. 



Electrophoresis 

SDS-PAGE was performed in duplicate using NuPAGE 10% Bis-Tris gels 
(Invitrogen) with MES-SDS running buffer at 100 V for 5 min. and then 200 V constant. 
Where possible, 25 µL of sample was loaded in each slot. One gel of each pair was stained 
with Coomassie Blue and the other gel was used for immunoblotting using the Boehringer 
Mannheim chromogenic Western blotting protocol (Roche). 

Immunoblotting 

The transfer buffer was Tris (0.25 M) - glycine (1.92 M) - methanol (20% v/v). The 
PVDF membrane was pre-wetted by successive moistening in methanol, deionized water, 
and finally transfer buffer. 

The PAGE gel was briefly washed in deionized water and transferred to blotting pads 
soaked in transfer buffer, then covered with pre-wetted PVDF membrane and pre-soaked 
blotting pads. Blotting was performed in transfer buffer at 400 mA constant for 2.5-3 h. The 
membrane was briefly washed (2x) in Tris buffered saline (TBS) (0.5 M Tris, 0.15 M NaCl, 
pH 7.5). Non-specific antibody binding was prevented by incubating the membrane in 1% v/v 
mouse/rabbit Blocking Reagent (Roche) in maleic acid solution (100 mM maleic acid, 150 
mM NaCl, pH 7.5) overnight at 4°C. 

The primary antibody used in these reactions was EP034323 diluted 1:1000. The 
reaction was performed with the Ab diluted in 1% Blocking Solution with a 30 min. reaction 
time. The membrane was washed 4x 10 min. in TBST (TBS + 0.1% v/v Tween 20). 

The secondary antibody consisted of anti-mouse/anti-rabbit IgG (Roche), 73 µL in 20 
ml of 1% Blocking Solution, with a reaction time of 30 min. The membrane was washed 4x 
15 min. in TBST and the substrate reaction (alkaline phosphatase) was performed with BM 
Chromogenic Western Blotting Reagent (Roche) until staining occurred. 

The results of the cross-reactivity with primary polyclonal antibody are shown in 
Table 6-1. 



Table 6-1. Immunoblotting Results 

Strain                        Immunoblot   Estimated Molecular   % Sequence Identity to   Protease Activity on 
                              Result       Mass (kDa)            69B4 Mature Protease     HI-Skim Milk Agar 
C. flavigena DSM 20109        positive     21                    66                       positive 
C. biazotea DSM 20112         negative                           65                       positive 
C. fimi DSM 20113             negative                           72                       weak + 
C. gelida DSM 20111           positive     20                    69                       weak + 
C. uda DSM 20107              negative                                                    weak + 
C. iranensis DSM 14785        negative                           33                       weak + 
C. cellasea DSM 20118         positive     27                    61                       positive 
C. xylanilytica LMG 21723     negative                           69                       positive 
O. turbata DSM 20577          positive     18                    73                       positive 
O. jenensis DSM 46000         positive     35                    78                       positive 
C. cellulans DSM 20424        negative                           48                       positive 
P. citrea DSM 43110                                                                       positive 
P. sukumoe DSM 44121          negative                           69                       positive 
X. ulmi LMG 21721             negative                           72                       negative 
I. variabilis DSM 10177       negative                                                    positive 
X. pachnodae DSM 12657        negative                                                    weak + 
A. ramosus DSM 43045          negative                                                    weak + 


Based on these results, the antibody used in these experiments is highly specific for 
detecting homologues with a very high percentage of amino acid sequence identity to 69B4 
protease. Furthermore, these results indicate that the C-terminal portion of the 69B4 mature 
protease chain is fairly variable, especially in the region of the two peptide epitopes. In these 
experiments, it was determined that more than 2 amino acid differences in this region 
resulted in a negative Western blotting reaction. 
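
The "amino acid differences" referred to above can be counted with a simple position-by-position comparison. The sketch below compares epitope 1 with the corresponding 14-residue stretches of two homologues, read by eye from the sequence listings in this document; it assumes these stretches align without gaps, and the function name is hypothetical.

```python
def count_differences(epitope: str, region: str) -> int:
    """Number of positions at which two equal-length peptides differ."""
    if len(epitope) != len(region):
        raise ValueError("peptides must have equal length")
    return sum(1 for a, b in zip(epitope, region) if a != b)


EPITOPE_1 = "TSGGSGNCRTGGTT"  # residues 151-164 of 69B4 mature protease

# Corresponding stretches taken from the listings (assumed gap-free).
HOMOLOGUE_REGIONS = {
    "C. flavigena": "TSGGSGNCRYGGTT",
    "C. gelida": "TSGGSGNCSSGGTT",
}

for strain, region in HOMOLOGUE_REGIONS.items():
    print(strain, count_differences(EPITOPE_1, region))
```

Both strains shown differ from epitope 1 at no more than two positions, and both are listed as immunoblot-positive in Table 6-1. The sketch covers epitope 1 only and is not a substitute for the experimental result.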



EXAMPLE 7 
Inverse PCR and Genome Walking 

In this Example, experiments conducted to elucidate polynucleotide sequences of 
ASP are described. The microorganisms utilized in these experiments were: 

1. Cellulomonas biazotea DSM 20112 

2. Cellulomonas flavigena DSM 20109 

3. Cellulomonas fimi DSM 20113 

4. Cellulomonas cellasea DSM 20118 

5. Cellulomonas gelida DSM 20111 

6. Cellulomonas iranensis DSM 14785 

7. Oerskovia jenensis DSM 46000 

8. Oerskovia turbata DSM 20577 

9. Cellulosimicrobium cellulans DSM 20424 

10. Promicromonospora citrea DSM 43110 

11. Promicromonospora sukumoe DSM 44121 

These bacterial strains were cultivated on Brain Heart Infusion broth or Tryptone 
Soya broth in 100/500 Erlenmeyer flasks with baffles at 230 rpm, 30°C for 2 days. Cells 
were separated from the culture broth by centrifugation for 30 min at 4766 x g. 

Chromosomal DNA was obtained by a standard phenol/chloroform extraction method 
known in the art from cells digested by lysozyme/EDTA (See e.g., Sambrook et al., supra). 
Chromosomal DNA was digested with restriction enzymes selected from the following 
list: ApaI, BamHI, BssHII, KpnI, NarI, NcoI, NheI, PvuI, SalI or SstII. 
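
When choosing an enzyme for inverse PCR, it helps to know which recognition sites occur in the fragment already sequenced. The sketch below scans a sequence for the standard recognition sites of the enzymes named above. The enzyme names are read from a partly illegible list in the source and are therefore an assumption; the recognition sequences are the commonly documented ones, and `find_sites` is a hypothetical helper.

```python
# Recognition sequences (5'->3'); all are palindromic, so scanning one
# strand is sufficient.
RECOGNITION_SITES = {
    "ApaI": "GGGCCC",
    "BamHI": "GGATCC",
    "BssHII": "GCGCGC",
    "KpnI": "GGTACC",
    "NarI": "GGCGCC",
    "NcoI": "CCATGG",
    "NheI": "GCTAGC",
    "PvuI": "CGATCG",
    "SalI": "GTCGAC",
    "SstII": "CCGCGG",
}


def find_sites(seq: str) -> dict:
    """Map each enzyme to the 1-based start positions of its site in seq."""
    seq = seq.upper()
    hits = {}
    for enzyme, site in RECOGNITION_SITES.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start + 1)
            start = seq.find(site, start + 1)  # allow overlapping matches
        hits[enzyme] = positions
    return hits


# First 50 nucleotides of the C. flavigena fragment (SEQ ID NO:53).
fragment = "GTCGACGTCATCGGGGGCAACGCGTACTACATCGGGTCGCGCTCGCGGTG"
for enzyme, positions in find_sites(fragment).items():
    if positions:
        print(enzyme, positions)
```

In this short stretch only the SalI site at the very start of the fragment is found.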

The nucleotide and amino acid sequences of these organisms are provided below. 
In these listings, the mature protease is indicated in bold and the signal sequence is 
underlined. 



C. flavigena (DSM 20109) 

1 GTCGACGTCA TCGGGGGCAA CGCGTACTAC ATCGGGTCGC GCTCGCGGTG 
CAGCTGCAGT AGCCCCCGTT GCGCATGATG TAGCCCAGCG CGAGCGCCAC 

51 CTCGATCGGG TTCGCGGTCG AGGGCGGGTT CGTCACCGCG GGGCACTGCG 
GAGCTAGCCC AAGCGCCAGC TCCCGCCCAA GCAGTGGCGC CCCGTGACGC 

101 GGCGCGCGGG CGCGAGCACG TCGTCACCGT CGGGGACCTT CCGCGGCTCG 
CCGCGCGCCC GCGCTCGTGC AGCAGTGGCA GCCCCTGGAA GGCGCCGAGC 

151 TCGTTCCCCG GCAACGACTA CGCGTGGGTC CAGGTCGCCT CGGGCAACAC 
AGCAAGGGGC CGTTGCTGAT GCGCACCCAG GTCCAGCGGA GCCCGTTGTG 

201 GCCGCGCGGG CTGGTGAACA ACCACTCGGG CGGCACGGTG CGCGTCACCG 
CGGCGCGCCC GACCACTTGT TGGTGAGCCC GCCGTGCCAC GCGCAGTGGC 

251 GCTCGCAGCA GGCCGCGGTC GGCTCGTACG TGTGCCGATC GGGCAGCACG 
CGAGCGTCGT CCGGCGCCAG CCGAGCATGC ACACGGCTAG CCCGTCGTGC 

301 ACGGGATGGC GGTGCGGCTA CGTCCGGGCG TACAACACGA CCGTGCGGTA 
TGCCCTACCG CCACGCCGAT GCAGGCCCGC ATGTTGTGCT GGCACGCCAT 

351 CGCGGAGGGC TCGGTCTCGG GCCTCATCCG CACGAGCGTG TGCGCCGAGC 
GCGCCTCCCG AGCCAGAGCC CGGAGTAGGC GTGCTCGCAC ACGCGGCTCG 

401 CGGGCGACTC CGGCGGCTCG CTGGTCGCCG GCACGCAGGC CCAGGGCGTC 
GCCCGCTGAG GCCGCCGAGC GACCAGCGGC CGTGCGTCCG GGTCCCGCAG 

451 ACGTCGGGCG GGTCCGGCAA CTGCCGCTAC GGGGGCACGA CGTACTTCCA 
TGCAGCCCGC CCAGGCCGTT GACGGCGATG CCCCCGTGCT GCATGAAGGT 

501 GCCCGTGAAC GAGATCCTGC AGGACCAGCC CGGGCCGTCG ACCACGCGTG 
CGGGCACTTG CTCTAGGACG TCCTGGTCGG GCCCGGCAGC TGGTGCGCAC 

551 CCCTA 

GGGAT (SEQ ID NO: 53) 



Cellulomonas flavigena (DSM 20109) 

1 VDVIGGNAYY IGSRSRCSIG FAVEGGFVTA GHCGRAGAST SSPSGTFRGS 

51 SFPGNDYAWV QVASGNTPRG LVNNHSGGTV RVTGSQQAAV GSYVCRSGST 

101 TGWRCGYVRA YNTTVRYAEG SVSGLIRTSV CAEPGDSGGS LVAGTQAQGV 

151 TSGGSGNCRY GGTTYFQPVN EILQDQPGPS TTRAL (SEQ ID NO: 54) 
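
In the nucleotide listings, each numbered row gives the coding strand and the row beneath it gives the complementary strand, written left to right without reversal. The sketch below checks this for the first 30 nucleotides of the C. flavigena fragment and translates the coding strand using the standard genetic code, assuming the reading frame starts at nucleotide 1. Function names are hypothetical.

```python
BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def complement(seq: str) -> str:
    """Base-by-base complement, without reversing the strand."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))


def translate(seq: str) -> str:
    """Translate complete codons starting at the first nucleotide."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# First 30 nucleotides of SEQ ID NO:53 and the row printed beneath them.
top = "GTCGACGTCATCGGGGGCAACGCGTACTAC"
bottom = "CAGCTGCAGTAGCCCCCGTTGCGCATGATG"

print(complement(top) == bottom)
print(translate(top))
```

The translation begins VDVIGGNAYY, consistent with the start of the C. flavigena amino acid sequence (SEQ ID NO:54).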



Cellulomonas biazotea (DSM 20112) 

1 TAAAACAGAC GGCCAGTGAA TTTGTAATAC GACTCACTAT AGGCGAATTG 
ATTTTGTCTG CCGGTCACTT AAACATTATG CTGAGTGATA TCCGCTTAAC 

51 AATTTAGCGG CCGCGAATTC GCCCTTACCT ATAGGGCACG CGTGGTCGAC 
TTAAATCGCC GGCGCTTAAG CGGGAATGGA TATCCCGTGC GCACCAGCTG 

101 GGCCCTGGGC TGGTACGTCG ACGTCACTAC CAACACGGTC GTCGTCAACG 
CCGGGACCCG ACCATGCAGC TGCAGTGATG GTTGTGCCAG CAGCAGTTGC 

151 CCACCGCCCT CGCCGTGGCC CAGGCGACCG AGATCGTCGC CGCCGCAACG 
GGTGGCGGGA GCGGCACCGG GTCCGCTGGC TCTAGCAGCG GCGGCGTTGC 

201 GTGCCCGCCG ACGCCGTCCG GGTCGTCGAG ACCACCGAGG CGCCCCGCAC 
CACGGGCGGC TGCGGCAGGC CCAGCAGCTC TGGTGGCTCC GCGGGGCGTG 

251 GTTCATCGAC GTCATCGGCG GCAACCGTTA CCGGATCAAC AACACCTCGC 
CAAGTAGCTG CAGTAGCCGC CGTTGGCAAT GGCCTAGTTG TTGTGGAGCG 

301 GCTGCTCGGT CGGCTTCGCC GTCAGCGGCG GCTTCGTCAC CGCCGGGCAC 
CGACGAGCCA GCCGAAGCGG CAGTCGCCGC CGAAGCAGTG GCGGCCCGTG 

351 TGCGGGACGA CCGGCGCGAC CACGACGAAA CCGTCCGGCA CGTTCGCCGG 
ACGCCGTGCT GGCCGCGCTG GTGCTGCTTT GGCAGGCCGT GCAAGCGGCC 

401 CTCGTCGTTC CCCGGCAACG ACTACGCGTG GGTGCGCGTC GCGTCCGGCA 
GAGCAGCAAG GGGCCGTTGC TGATGCGCAC CCACGCGCAG CGCAGGCCGT 

451 ACACCCCGGT CGGCGCCGTG AACAACTACA GCGGCGGCAC CGTGGCCGTC 
TGTGGGGCCA GCCGCGGCAC TTGTTGATGT CGCCGCCGTG GCACCGGCAG 

501 GCCGGCTCGA CGCAGGCGAC CGTCGGTGCG TCCGTCTGCC GCTCCGGCTC 
CGGCCGAGCT GCGTCCGCTG GCAGCCACGC AGGCAGACGG CGAGGCCGAG 

551 CACCACGGGG TGGCGCTGCG GGACGATCCA GGCGTTCAAC TCCACCGTCA 
GTGGTGCCCC ACCGCGACGC CCTGCTAGGT CCGCAAGTTG AGGTGGCAGT 
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ACTACGCGCA GGGCAGCGTC n>n 

TGATGCGCGT CCC.CGCAG ^ ^SSSj 
GAGCCCGGCG ACTCCGGCGG ®CACACGCGG 
CTCGSSCCGC ^ *££S SSSS A0G — 
CCTGACGTCC GGCGGGTCGG GCAAPT ^GGGTCCK 
GGACTGCAGG CCGCCCGCC ^ SSSSSS 
TCCAGCCCGT CAACGAGGC TC SCCGCCC TGCTGCATGA 

AGGTCGGGCA GTTGCTCCGC J^ggC 
TCGTCCGGCO GCGGCGGTGT ^CCGGACTG CGAGCAGTGC 

AGCAGGCCGC CGCCGCCACC SSffi igE* 8 
GACCTACACC GGCTCGCTCG ""W* 
CTGGATGTGG CCGAGCGAGC SSSSSS Sg£SSS 
GCAGCTATGT GACCGTCGGG Tm aGGCGGCAG GTCGGCAGGC 

CGTCGATACA CXGGCAGCCC SSgg 5™GAC 

2S52E ^CGGACT CGA^.„ ^ GA ° G<aGCTC 



1001 CGCGTGGGC (SEQ i D N0 . 5S) 
GCGCACCCG «o.55) 



Cellulomonas biazotea (DSM 20112) 

30 1 KQTASEFVIR LTIGELNLaa 

151 TPVGAVHNYS GG^S ^I^ 8 ' 51 

301 S^GSSG, IRV - S SEHSSSR 

40 

Cellulomonas fimi (DSM 20113) 

CACC^CAC AGCCGCCGC ^ Kg5 
4S 51 TTCGATCGGG TTCGCCGTCA rno GGCGAC 

AAGCTAGCCC AAGCGGCAGT G^SI SSSSS 
101 GCCGCaCCGG CGCGGCCACG LCC GTGACGC 

151 AGCTTCCCGG GCAACGACPA ,p,o CGAGC 
TCGAAGGGCC CGTTGCTGAT SSSSSSSS ^ CGGGCAACAC 

CCCAGCGC A GCCCGTTGTG 



-< 
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oni CCCCGTCGGC GCGGTGAACA ACTACAGdGG CGGCACGGTC GCCGTCGCCG 
201 CGCCACTTGT TGATGTCGCC GCCGTGCCAG CGGCAGCGGC 

9M GCTCGACCCA GGCCGCCGTC GGTGCGACCG TGTGCCGCTC GGGCTCCACC 
cSgSggct CCGGCGGCAG CCACGCTGGC ACACGGCGAG CCCGAGGTGG 

, ni ACCGGCTGGC GGTGCGGCAC CATCCAGGCG TTCAACGCGA CCGTCAACTA 
togcIgXccg ccacgccgtg GTAGGTCCGC AAGTTGCGCT GGCAGTTGAT 

•^51 CGCCGAGGGC AGCGTCTCCG GCCTCATCCG CACGAACGTG TGCGCCGAGC 
gcggScccg TCGCAGAGGC CGGAGTAGGC gtgcttgcac ACGCGGCTCG 

AOl CCGGCGACTC GGGCGGCTCG CTCGTCGCCG GCAACCAGGC GCAGGGCATG 
ggccgctgaS cccgccgagc GAGCAGCGGC CGTTGGTCCG CGTCCCGTAC 

451 ACGTCCGGCG GCTCCGACAA CTGC (SEQIDNO:57) 
TGCAGGCCGC CGAGGCTGTT GACG 



Ce "T °%£Z£% iScSIG FAVTGGFVTA GHCGRTGAAT TSPSGTFAGS 
5 i ZTgZykw RVASGNTPVG AVNNYSGGTV AVAGSTQAAV GATVCRSGST 
0i tS^S fnatvnvaeg svsglirtnv caepgdsggs LVAG (SEQ id 



101 
NO:58) 



Cellulononas de^ cggcgtgacg tcgggcgggt cgggcaactg 

gIgcgIccgt tggtccgcgt cccgcactgc agcccgccca gcccgttgac 

51 ctcgtcgggc gggacgacgt acttccagcc cgtcaacgag gccctccggg 
gIScccg ccctgctgca tgaaggtcgg gcagttgctc cgggaggccc 

-, ni TfTACGGGCT CACGCTCGTG ACCTCTGACG GTGGGGGCAC CGAGCCGCCG 

101 IS^c^gI ctgcgagcac tggagactgc cacccccgtg gctcggcggc 

, c , ^P^rrrr-r GCCAGGGCTA TGCGCGGACC TACCAGGGCA GCGTCTCGGC 
151 GGC^cI ACGCGCCTGG ATGGTCCCGT CGCAGAGCCG 

?ni CGGGACGTCG GTCGCGCAGC CGAACGGTTC GTACGTCACG ACCGGGGGCG 
^TCCAGC CAGCGCGTCG GCTTGCCAAG CATGCAGTGC TGGCCCCCGC 

„, GGACGCACCG GGTGTGCCTG AGCGGACCGG CGGGCACGGA CCTGGACCTG 

251 c?tc?g^c ccacacggac tcgcctggcc gcccgtgcct GGACCTGGAC 

, m TACCTGCAGA AGTGGAACGG GTACTCGTGG GCCAGCGTCG CGCACTCGAC 

301 SSSSSJS TCACCTTGCC catgagcacc cggtcgcagc gcgtcagctg 

«i fTCGCCTGGT GCCACGGAGG CGGTCACGTA CACCGGGACC GCCGGCTACT 
351 SS CGGTGCCTCC GCCAGTGCAT GTGGCCCTGG CGGCCGATGA 

rr-rrCACGCG TACGCGGGTT CGGGGGCGTA CACCCTGGGG 

401 SkS SS I£cgccc« gcccccgcat gtgggacccc 
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CGCTGCTGGG GC 






Cellulomonas gelida (DSM 20111) 

1 LAGNQAQGVT SGGSGNCSSG GTTYFQPVNE ALRVYGLTLV TSDGGGTEPP 

51 PTGCQGYART YQGSVSAGTS VAQPNGSYVT TGGGTHRVCL SGPAGTDLDL 

101 YLQKWNGYSW ASVAQSTSPG ATEAVTYTGT AGYYRYVVHA YAGSGAYTLG 

151 ATTP (SEQ ID NO: 60) 



Cellulomonas iranensis (DSM 14785) 

1 sss sss = ssss SS5SS 

101 = SSSS £S= SSSS 
151 SSS SSSS SSSS SSSS SSSS 

201 ssss = ssss ssss 

251 GCGACTC (SEQ ID NO: 61) 
CGCTGAG 

Cellulomonas iranensis (DSM 14785) 

1 FPGNDYAWVQ VGSGDTPRGL VNNYAGGTVR VTGSQQAAVG AYVCRSGSTT 
51 GWRCGTVQAY NASVRYAEGT VSGLIRTNVC AEPGD (SEQ ID NO:62) 

Cellulomonas cellasea (DSM 20118) 

1 S SSSSSSS SSSS SSSSS5S ssss 
51 SSSSS SSSSS SSSS SSSSS 22 

101 SSSS s sss ssssss s 

151 SS35 SSS SSS SSS SSS 
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201 CGGCGACACC GCGGCGGCCG CCGACGCCGT CGCCGCCGCG GGCCTGCCTG 
GCCGCTGTGG CGCCGCCGGC GGCTGCGGCA GCGGCGGCGC CCGGACGGAC 

251 CCGACGCCGT GACGCTCGCG ACCACCGAGG CGCCACGGAC GTTCGTCGAC 
GGCTGCGGCA CTGCGAGCGC TGGTGGCTCC GCGGTGCCTG CAAGCAGCTG 

301 GTCATCGGCG GCAACGCGTA CTACATCAAC GCGAGCAGCC GCTGCTCGGT 
CAGTAGCCGC CGTTGCGCAT GATGTAGTTG CGCTCGTCGG CGACGAGCCA 

351 CGGCTTCGCG GTCGAGGGCG GGTTCGTCAC CGCGGGCCAC TGCGGGCGCG 
GCCGAAGCGC CAGCTCCCGC CCAAGCAGTG GCGCCCGGTG ACGCCCGCGC 

401 CGGGCGCGAG CACGTCGTCA CCGTCGGGGA CCTTCCGCGG CTCGTCGTTC 
GCCCGCGCTC GTGCAGCAGT GGCAGCCCCT GGAAGGCGCC GAGCAGCAAG 

451 CCCGGCAACG ACTACGCGTG GGTCCAGGTC GCCTCGGGCA ACACGCCGCG 
GGGCCGTTGC TGATGCGCAC CCAGGTCCAG CGGAGCCCGT TGTGCGGCGC 

501 CGGGCTGGTG AACAACCACT CGGGCGGCAC GGTGCGCGTC ACCGGCTCGC 
GCCCGACCAC TTGTTGGTGA GCCCGCCGTG CCACGCGCAG TGGCCGAGCG 

551 AGCAGGCCGC GGTCGGCTCG TACGTGTGCC GATCGGGCAG CACGACGGGA 
TCGTCCGGCG CCAGCCGAGC ATGCACACGG CTAGCCCGTC GTGCTGCCCT 

601 TGGCGGTGCG GCTACGTCCG GGCGTACAAC ACGACCGTGC GGTACGCGGA 
ACCGCCACGC CGATGCAGGC CCGCATGTTG TGCTGGCACG CCATGCGCCT 

651 GGGCTCGGTC TCGGGCCTCA TCCGCACGAG CGTGTGCGCC GAGCCGGGCG 
CCCGAGCCAG AGCCCGGAGT AGGCGTGCTC GCACACGCGG CTCGGCCCGC 

701 ACTCCGGCGG CTCGCTGGTC GCCGGCACGC AGGCCCAGGG CGTCACGTCG 
TGAGGCCGCC GAGCGACCAG CGGCCGTGCG TCCGGGTCCC GCAGTGCAGC 

751 GGCGGGTCCG GCAACTGCCG CTACGGGGGC ACGACGTACT TCCAGCCCCT 
CCGCCCAGGC CGTTGACGGC GATGCCCCCG TGCTGCATGA AGGTCGGGCA 

801 GAACGAGATC CTGCAGGCCT ACGGTCTGCG TCTCGTCCTG GGCTGACACG 
CTTGCTCTAG GACGTCCGGA TGCCAGACGC AGAGCAGGAC CCGACTGTGC 

851 CTCGCGGCGG GCCCTCCCCT GCCCGTCGCG CGCCGGCCCC ACCAGCCCGG 
GAGCGCCGCC CGGGAGGGGA CGGGCAGCGC GCGGCCGGGG TGGTCGGGCC 

901 GCCG (SEQ ID NO: 63) 
CGGC 



Cellulomonas cellasea (DSM 20118) 

1 VGRVRQLPLR GHDVLPARER DPAGLRSASR PGLTRSRRAR LDAAGPSARV
51 AAWYVDVPTN KLWESVGDT AAAADAVAAA GLPADAVTLA TTEAPRTFVD
101 VIGGNAYYIN ASSRCSVGFA VEGGFVTAGH CGRAGASTSS PSGTFRGSSF
151 PGNDYAWVQV ASGNTPRGLV NNHSGGTVRV TGSQQAAVGS YVCRSGSTTG
201 WRCGYVRAYN TTVRYAEGSV SGLIRTSVCA EPGDSGGSLV AGTQAQGVTS
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251 GGSGNCRYGG TTYFQPVNEI LQAYGLRLVL G*HARGGPSP ARRAPAPPAR 
301 A (SEQ ID NO: 64) 



Cellulomonas xylanilytica (LMG 21723) 

1 CGCTGCTCGA TCGGGTTCGC CGTGACGGGC GGCTTCGTGA CCGCCGGCCA 
CTGCGGACGG TCCGGCGCGA CGACGACGTC GCCGAGCGGC ACGTTCGCCG 

GCGACGAGCT AGCCCAAGCG GCACTGCCCG CCGAAGCACT GGCGGCCGGT 
GACGCCTGCC AGGCCGCGCT GCTGCTGCAG CGGCTCGCCG TGCAAGCGGC 

101 GGTCCAGCTT TCCCGGCAAC GACTACGCCT GGGTCCGCGC GGCCTCGGGC 
AACACGCCGG TCGGTGCGGT GAACCGCTAC GACGGGAGCC GGGTGACCGT 

CCAGGTCGAA AGGGCCGTTG CTGATGCGGA CCCAGGCGCG CCGGAGCCCG 
TTGTGCGGCC AGCCACGCCA CTTGGCGATG CTGCCGTCGG CCCACTGGCA 

201 GGCCGGGTCC ACCGACGCGG CCGTCGGTGC CGCGGTCTGC CGGTCGGGGT 
CGACGACCGC GTGGGGCTGC GGCACGATCC AGTCCCGCGG CGCGAGCGTC 

CCGGCCCAGG TGGCTGCGCC GGCAGCCACG GCGCCAGACG GCCAGCCCCA 
GCTGCTGGCG CACCCCGACG CCGTGCTAGG TCAGGGCGCC GCGCTCGCAG 

301 ACGTACGCCC AGGGCACCGT CAGCGGGCTC ATCCGCACCA ACGTGTGCGC 
CGAGCCGGGT GACTCCGGGG GGTCGCTGAT CGCGGGCACC CAGGCGCGGG 

TGCATGCGGG TCCCGTGGCA GTCGCCCGAG TAGGCGTGGT TGCACACGCG 
GCTCGGCCCA CTGAGGCCCC CCAGCGACTA GCGCCCGTGG GTCCGCGCCC 

401 GCGTGACGTC CGGCGGCTCC GGCAACTGC (SEQ ID NO: 65) 
CGCACTGCAG GCCGCCGAGG CCGTTGACG 



Cellulomonas xylanilytica (LMG 21723) 

1 RCSIGFAVTG GFVTAGHCGR SGATTTSPSG TFAGSSFPGN DYAWVRAASG
51 NTPVGAVNRY DGSRVTVAGS TDAAVGAAVC RSGSTTAWGC GTIQSRGASV 
101 TYAQGTVSGL IRTNVCAEPG DSGGSLIAGT QARGVTSGGS GNC (SEQ ID 
NO:66) 



Oerskovia turbata (DSM 20577) 

1 ATGGCACGAT CATTCTGGAG GACGCTCGCC ACGGCGTGCG CCGCGACGGC
TACCGTGCTA GTAAGACCTC CTGCGAGCGG TGCCGCACGC GGCGCTGCCG

51 ACTGGTTGCC GGCCCCGCAG CGCTCACCGC GAACGCCGCG ACGCCCACCC
TGACCAACGG CCGGGGCGTC GCGAGTGGCG CTTGCGGCGC TGCGGGTGGG

101 CCGACACCCC GACCGTTTCA CCCCAGACCT CCTCGAAGGT CTCGCCCGAG
GGCTGTGGGG CTGGCAAAGT GGGGTCTGGA GGAGCTTCCA GAGCGGGCTC

151 GTGCTCCGCG CCCTCCAGCG GGACCTGGGG CTGAGCGCCA AGGACGCGAC
CACGAGGCGC GGGAGGTCGC CCTGGACCCC GACTCGCGGT TCCTGCGCTG

201 GAAGCGTCTG GCGTTCCAGT CCGACGCGGC GAGCACCGAG GACGCTCTCG
CTTCGCAGAC CGCAAGGTCA GGCTGCGCCG CTCGTGGCTC CTGCGAGAGC

251 CCGACAGCCT GGACGCCTAC GCGGGCGCCT GGGTCGACCC TGCGAGGAAC
GGCTGTCGGA CCTGCGGATG CGCCCGCGGA CCCAGCTGGG ACGCTCCTTG

301 ACCCTGTACG TCGGCGTCGC CGACAGGGCC GAGGCCAAGG AGGTCCGTTC
TGGGACATGC AGCCGCAGCG GCTGTCCCGG CTCCGGTTCC TCCAGGCAAG

351 GGCCGGAGCG ACCCCCGTGG TCGTCGACCA CACGCTCGCC GAGCTCGACA
CCGGCCTCGC TGGGGGCACC AGCAGCTGGT GTGCGAGCGG CTCGAGCTGT

401 CGTGGAAGGC GGCGCTCGAC GGTGAGCTCA ACGACCCCGC GGGCGTCCCG
GCACCTTCCG CCGCGAGCTG CCACTCGAGT TGCTGGGGCG CCCGCAGGGC

451 AGCTGGTTCG TCGACGTCAC GACCAACCAG GTCGTCGTCA ACGTGCACGA
TCGACCAAGC AGCTGCAGTG CTGGTTGGTC CAGCAGCAGT TGCACGTGCT

501 CGGCGGACGC GCCCTCGCGG AGCTGGCTGC CGCGAGCGCG GGCGTGCCCG
GCCGCCTGCG CGGGAGCGCC TCGACCGACG GCGCTCGCGC CCGCACGGGC

551 CCGACGCCAT CACCTACGTG ACGACGACCG AGGCTCCTCG TCCCCTCGTC
GGCTGCGGTA GTGGATGCAC TGCTGCTGGC TCCGAGGAGC AGGGGAGCAG

601 GACGTGGTGG GCGGCAACGC GTACACCATG GGTTCGGGCG GGCGCTGCTC
CTGCACCACC CGCCGTTGCG CATGTGGTAC CCAAGCCCGC CCGCGACGAG

651 GGTCGGCTTC GCGGTGAACG GGGGCTTCAT CACGGCCGGG CACTGCGGCT
CCAGCCGAAG CGCCACTTGC CCCCGAAGTA GTGCCGGCCC GTGACGCCGA

701 CGGTCGGCAC CCGCACCTCG GGGCCGGGCG GCACGTTCCG GGGGTCGAAC
GCCAGCCGTG GGCGTGGAGC CCCGGCCCGC CGTGGAAGGC CCCCAGCTTG

751 TTCCCCGGCA ACGACTACGC CTGGGTGCAG GTCGACGCGG GTAACACCCC
AAGGGGCCGT TGCTGATGCG GACCCACGTC CAGCTGCGCC CATTGTGGGG

801 GGTCGGCGCG GTCAACAACT ACAGCGGTGG GCGCGTCGCG GTCGCAGGGT
CCAGCCGCGC CAGTTGTTGA TGTCGCCACC CGCGCAGCGC CAGCGTCCCA

851 CGACGGCCGC GCCCGTGGGG GCCTCGGTCT GCCGGTCCGG TTCCACGACG
GCTGCCGGCG CGGGCACCCC CGGAGCCAGA CGGCCAGGCC AAGGTGCTGC

901 GGCTGGCACT GCGGCACCAT CGGCGCGTAC AACACCTCGG TGACGTACCC
CCGACCGTGA CGCCGTGGTA GCCGCGCATG TTGTGGAGCC ACTGCATGGG

951 GCAGGGCACC GTCTCGGGGC TCATCCGCAC GAACGTGTGC GCCGAGCCCG
CGTCCCGTGG CAGAGCCCCG AGTAGGCGTG CTTGCACACG CGGCTCGGGC

1001 GCGACTCGGG CGGCTCGCTC CTCGCGGGCA ACCAGGCGCA GGGCGTGACC
CGCTGAGCCC GCCGAGCGAG GAGCGCCCGT TGGTCCGCGT CCCGCACTGG
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1051 TCGGGCGGGT CGGGfaaom,, 

-«« ^sss 2Kg SS s~ 

5 "01 GGTCAACGAG GCCCTCGGGP MMGGTCGO 
GCAGTTGCTC CGgSSS ^ ^ «~ 
1151 0 TCGGG<SCCC<5S *»CGAGCAC TGGAGACTGC 

1201 ACCAGGGCAG CGTCTCGGCr ^1^-CCGATA CGCGCCTGGA 

1251 ACGTCACGAC CGGGGGP^ AGCGCGTCGC TTGCCAAGCA 

Oerskovia turbata (DSM 20577)

ioi 1 S^Ssss*^ 

.2222~2S=S2K 2 2SS 

««2E gSSS I""™-"" 2£££ Sf GS ™ 
Oerskovia jenensis (DSM 46000)

=1 CACTGCGGGA CGGTGGGCAC « "^CG 
— CGCCCX GCCaS 2g"» GCACGTTCCG 

101 CGGGTCGAGC TTCCCCGGCA ACT CCTS ^ 
GCCCAGCTCG AAGGGGCCGT !g£S££ GTCGACGCGG 
151 GGAACACCCC GGTCarw GACCCACGTC CAGGTOCGCc 

201 GTCGCGGGCT CGACGGCCGC Afv TCTCGCC » CC ^CGCAGCGC 

~-«-S2S22SS2sS2=-* 

251 TTCCACGACG GGCTGGCGCT crv °»CC»«GCC 
AAGGTGCTGC CC^ AACAGCTCGG 

301 ==S=SS2S25== 



201 
251 
301 
351 
401 



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CGGCTCGGCC CGCTGAGCCC GCCGAGCGAG GAGCGCCCGT TGGTCCGTGT 

- = SS= SSSSS SSSSSS SS55S 

- SSSS S£S= ™ SSSSS = 

501 ACCTCCGGCG GCAGGGGCAA CTGC (SEQ ID NO: 69) 
TGGAGGCCGC CGTCCCCGTT GACG 

Oers/cov/a/enens/s(DSM 46000) ^ TFRGSSF pgn dyawvqvdag 

i rcsvgfavng gfvtaghcgt vgtrtsgpgg c gtiaaynssv 

XK™ P G OAOG^SGGS GNCSSGGTTY 

15 1 ^SWSA YGLTLVTSGG RGNC (SEQ ID NO:70) 



» ssss SSSS SSSS SSSS ssss 

» SSSSS SSSSS SSSSS SSSS SSSSS 
■ - SSSS SSSSS SSSS SSSSS SSSSS 
■« SSSSS SSSS SSSS SSSSS SSSS 

- S5SSSSSSSSSS5 SSSS SSSSS 

~ ssssssss SSSS SSSS SSSS SSSS 

3" SSSSS ssss ssssss =ss 

- sss ssss ssss sssss ssss 

- SSSS SSSS SSSSS =sss =ss 

501 CTGGGCGCGC ATGGGGACGG TCCAGGCGGC GTCGTTCCCC GGCCACGACT 
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GACCCGCGCG TACCCCTGCC AGGTCCGCCG CAGCAAGGGG CCGGTGCTGA 

551 SSS5SSES S CGTCCC — c 

5 CGGCCCAAGA GGGGGCAGGG CGCCCACTTG 
601 AACTACGCCG GCGGCACCGT nr.p^ 

TOGATGCGGC CGCgSSS SS?*** A<WCG <=™» 

«• w-TGCAGCGG CCGAGCCGGC TCCGCGGGCA 

• ^^=2=555555555 

701 =25 555 555 555 sss 
' 6 751 =5. 555 555 555 555 

20 eCGTCCCGCA GTGCAGCCCG TCCTAGTTGC 

951 GCAAGTGCAT CGACGTCCCC Pfip^Anm 

CGTTCACGTA «ft55 £S (SEQ 10 



30 



Cellulosimicrobium cellulans (DSM 20424) 

si 1 SS S ™ K »« «~ 

151 TAGHCGRGGT RRVLTASWAR MGT^SS^ SSSSS™ """WW ffl 

25i s~ 555 s5 sss 555 3 

^ 301 AHBHGHAVG8 KA^S SST,^,^ £ 

> 



Promicromonospora citrea (DSM 43110)

. 1 =5 555 555 55s 555 § 

51 2=3 555 555 555 555 % 
" m == 555 555 555 555 

151 GGCTGGCACT GCGGCACCAT CCAGGCGCTG AACGCGTCGG TCACCTACGC 
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CCGACCGTGA CGCCGTGGTA GGTCCGCGAC TTGCGCAGCC AGTGGATGCG 

201 CGAGGGCACC GTGAGCGGCC TCATCCGCAC CAACGTGTGC GCCGAGCCCG 
GCTCCCGTGG CACTCGCCGG AGTAGGCGTG GTTGCACACG CGGCTCGGGC 

251 GCGACTC (SEQ ID NO: 73) 
CGCTGAG 



Promicromonospora citrea (DSM 43110) 

1 FPGNDYAWVN TGTDDTLVGA VNNYSGGTVN VAGSTRAAVG ATVCRSGSTT 
51 GWHCGTIQAL NASVTYAEGT VSGLIRTNVC AEPGD (SEQ ID NO:74) 



Promicromonospora sukumoe (DSM 44121) 

1 TTCCCCGGCA ACGACTACGC GTGGGTGAAC GTCGGCTCCG ACGACACCCC 
AAGGGGCCGT TGCTGATGCG CACCCACTTG CAGCCGAGGC TGCTGTGGGG 

51 GATCGGTGCG GTCAACAACT ACAGCGGCGG CACCGTGAAC GTCGCGGGCT 
CTAGCCACGC CAGTTGTTGA TGTCGCCGCC GTGGCACTTG CAGCGCCCGA 

101 CGACCCAGGC CGCCGTCGGC TCCACCGTCT GCCGCTCCGG TTCCACGACC 
GCTGGGTCCG GCGGCAGCCG AGGTGGCAGA CGGCGAGGCC AAGGTGCTGG 

151 GGCTGGCACT GCGGCACCAT CCAGGCCTTC AACGCGTCGG TCACCTACGC 
CCGACCGTGA CGCCGTGGTA GGTCCGGAAG TTGCGCAGCC AGTGGATGCG 

201 CGAGGGCACC GTGTCCGGCC TGATCCGCAC CAACGTCTGC GCCGAGCCCG 
GCTCCCGTGG CACAGGCCGG ACTAGGCGTG GTTGCAGACG CGGCTCGGGC 

251 GCGACTC (SEQ ID NO: 75) 
CGCTGAG 



Promicromonospora sukumoe (DSM 44121) 

1 FPGNDYAWVN VGSDDTPIGA VNNYSGGTVN VAGSTQAAVG STVCRSGSTT 
51 GWHCGTIQAF NASVTYAEGT VSGLIRTNVC AEPGD (SEQ ID NO: 76) 



Xylanibacterium ulmi (LMG 21721) 

1 GCCGCTGCTC GATCGGGTTC GCCGTGACGG GCGGCTTCGT GACCGCCGGC 
CGGCGACGAG CTAGCCCAAG CGGCACTGCC CGCCGAAGCA CTGGCGGCCG 

51 CACTGCGGAC GGTCCGGCGC GACGACGACG TCCGCGAGCG GCACGTTCGC 
GTGACGCCTG CCAGGCCGCG CTGCTGCTGC AGGCGCTCGC CGTGCAAGCG 

101 CGGGTCCAGC TTTCCCGGCA ACGACTACGC CTGGGTCCGC GCGGCCTCGG 
GCCCAGGTCG AAAGGGCCGT TGCTGATGCG GACCCAGGCG CGCCGGAGCC 
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151 GAACACGCCG GTCGGTGCGG TGAACCGCTA CGACGGCAGC CGGGTGACCG 
CTTGTGCGGC CAGCCACGCC ACTTGGCGAT GCTGCCGTCG GCCCACTGGC 

201 TGGCCGGGTC CACCGACGCG GCCGTCGGTG CCGCGGTCTG CCGGTCGGGG 
ACCGGCCCAG GTGGCTGCGC CGGCAGCCAC GGCGCCAGAC GGCCAGCCCC 

251 TCGACGACCG CGTGGCGCTG CGGCACGATC CAGTCCCGCG GCGCGACGGT 
AGCTGCTGGC GCACCGCGAC GCCGTGCTAG GTCAGGGCGC CGCGCTGCCA 

301 CACGTACGCC CAGGGCACCG TCAGCGGGCT CATCCGCACC AACGTGTGCG 
GTGCATGCGG GTCCCGTGGC AGTCGCCCGA GTAGGCGTGG TTGCACACGC 



351 CCGAGCCGGG TGACTCCGGG GGGTCGCTGA TCGCGGGCAC CCAGGCGCAG 
GGCTCGGCCC ACTGAGGCCC CCCAGCGACT AGCGCCCGTG GGTCCGCGTC 

401 GGCGTGACGT CCGGCGGCTC CGGCAACTGC (SEQ ID NO: 77) 
CCGCACTGCA GGCCGCCGAG GCCGTTGACG 

Xylanibacterium ulmi (LMG 21721)

1 RCSIGFAVTG GFVTAGHCGR SGATTTSASG TFAGSSFPGN DYAWVRAASG 
51 NTPVGAVNRY DGSRVTVAGS TDAAVGAAVC RSGSTTAWRC GTIQSRGATV 
101 TYAQGTVSGL IRTNVCAEPG DSGGSLIAGT QAQGVTSGGS G (SEQ ID NO: 78) 



Inverse PCR 

Inverse PCR was used to determine the full-length serine protease genes from 
chromosomal DNA of bacterial strains of the suborder Micrococcineae shown by PCR or 
immunoblotting to be novel homologues of the new Cellulomonas sp. 69B4 protease 
described herein. 

Digested DNA was purified using the PCR purification kit (Qiagen, Catalogue # 
28106), and self-ligated with T4 DNA ligase (Invitrogen) according to the manufacturers' 
instructions. Ligation mixtures were purified with the PCR purification kit (Qiagen), and
PCR was performed with primers selected from the following list:



RV-1 Rest 5' - ACCCACGCGTAGTCGTTGCC - 3' (SEQ ID NO:79)

RV-1 Cellul 5' - ACCCACGCGTAGTCGTKGCCGGGG - 3' (SEQ ID NO:80)

RV-2 biaz-fimi 5' - TCGTCGTGGTCGCGCCGG - 3' (SEQ ID NO:81)

RV-2 cella-flavi 5' - CGACGTGCTCGCGCCCG - 3' (SEQ ID NO:82)

RV-2 cellul 5' - CGCGCCCAGCTCGCGGTG - 3' (SEQ ID NO:83)

RV-2 turb 5' - CGGCCCCGAGGTGCGGGTGCCG - 3' (SEQ ID NO:84)

Fw-1 biaz-fimi 5' - CAGCGTCTCCGGCCTCATCCGC - 3' (SEQ ID NO:85)

Fw-1 cella-flavi 5' - CTCGGTCTCGGGCCTCATCCGC - 3' (SEQ ID NO:86)

Fw-1 cellul 5' - CGACGTTCCCGGCCTCGTGCGC - 3' (SEQ ID NO:87)

Fw-1 turb 5' - CACCGTCTCGGGGCTCATCCGC - 3' (SEQ ID NO:88)

Fw-2 rest 5' - AGCARCGTGTGCGCCGAGCC - 3' (SEQ ID NO:89)

Fw-2 cellul 5' - GGCAGCGCGTGCGCGGAGGG - 3' (SEQ ID NO:90)

Fw-1 gelida 5' - GCCGCTGCTCGATCGGGTTC - 3' (SEQ ID NO:91)

Rv-1 gelida 5' - GCAGTTGCCGGAGCCGCCGGACGT - 3' (SEQ ID NO:92)

The amplified PCR products were examined by agarose gel electrophoresis (0.8%
agarose in TBE buffer (Invitrogen)). Distinct bands in the range 1.3 - 2.2 kbp for each
organism were excised from the gel and purified using the Qiagen gel extraction kit, and the
sequence was analyzed by BaseClear. Sequence analysis revealed that these DNA fragments
covered additional parts of protease gene homologues of the Cellulomonas 69B4
protease gene.



Genome Walking Using Rapid Amplification of Genomic Ends (RAGE) 

A genome walking methodology (RAGE) known in the art was used to determine the
full-length serine protease genes from chromosomal DNA of bacterial strains of the
suborder Micrococcineae shown by PCR or immunoblotting to be novel homologues of the
new Cellulomonas sp. 69B4 protease. RAGE was performed using the Universal
GenomeWalker™ Kit (BD Biosciences Clontech), with some modifications to the
manufacturer's protocol (BD Biosciences user manual PT3042-1, Version # PR03300).
Modifications to the manufacturer's protocol included addition of DMSO (3 µL) to the
reaction mixture in 50 µL total volume due to the high GC content of the template DNA and
use of Advantage™ - GC Genomic Polymerase Mix (BD Biosciences Clontech) for the PCR
reactions, which were performed as follows:









Step                                    PCR 1        PCR 2

99°C - 0.05 sec
94°C - 0.25 sec / 72°C - 3.00 min       7 cycles     4 cycles
94°C - 0.25 sec / 67°C - 4.00 min       39 cycles    24 cycles
67°C - 7.00 min
15°C - 1.00 min









PCR was performed with primers (Invitrogen, Paisley, UK) selected from the following list
(listed in 5' to 3' orientation):

RV-1 Rest ACCCACGCGTAGTCGTTGCC (SEQ ID NO:79)
RV-1 Cellul ACCCACGCGTAGTCGTKGCCGGGG (SEQ ID NO:80)
RV-2 biaz-fimi TCGTCGTGGTCGCGCCGG (SEQ ID NO:81)
RV-2 cella-flavi CGACGTGCTCGCGCCCG (SEQ ID NO:82)
RV-2 cellul CGCGCCCAGCTCGCGGTG (SEQ ID NO:83)
RV-2 turb CGGCCCCGAGGTGCGGGTGCCG (SEQ ID NO:84)
Fw-1 biaz-fimi CAGCGTCTCCGGCCTCATCCGC (SEQ ID NO:85)
Fw-1 cella-flavi CTCGGTCTCGGGCCTCATCCGC (SEQ ID NO:86)
Fw-1 cellul CGACGTTCCCGGCCTCGTGCGC (SEQ ID NO:87)
Fw-1 turb CACCGTCTCGGGGCTCATCCGC (SEQ ID NO:88)
Fw-2 rest AGCARCGTGTGCGCCGAGCC (SEQ ID NO:89)
Fw-2 cellul GGCAGCGCGTGCGCGGAGGG (SEQ ID NO:90)
Fw-1 gelida GCCGCTGCTCGATCGGGTTC (SEQ ID NO:91)
Rv-1 gelida GCAGTTGCCGGAGCCGCCGGACGT (SEQ ID NO:92)
Flavi FW1 TGCGCCGAGCCCGGCGACTCCGGC (SEQ ID NO:93)
Flavi FW2 GGCACGACGTACTTCCAGCCCGTGAAC (SEQ ID NO:94)
Flavi RV1 GACCCACGCGTAGTCGTTGCCGGGGAACGACGA (SEQ ID NO:95)
Flavi RV2 GAAGGTCCCCGACGGTGACGACGTGCTCGCGCC (SEQ ID NO:96)
Turb FW1 CAGGCGCAGGGCGTGACCTCGGGCGGGTCG (SEQ ID NO:97)
Turb FW2 GGCGGGACGACGTACTTCCAGCCCGTCAA (SEQ ID NO:98)
Cellu RV1 CACCCACGCGTAGTCGTGGCCGGGGAACGA (SEQ ID NO:99)
Cellu RV2 GAAGCCGCCCTGGACGGCGTACCCGATCGAGCA (SEQ ID NO:100)
Cellu FW1 TGCGCGGAGGGCGGCGACTCGGGCGGGTCG (SEQ ID NO:101)
Cellu FW2 TTCCTCTACCAGCCCGTCAACCCGATCCTA (SEQ ID NO:102)
Cella RV2 CGCCGCGGGGACGAACCCGCCCTCGACCGCGAA (SEQ ID NO:103)
Cella RV1 CGCGTAGTCGTTGCCGGGGAACGACGAGCC (SEQ ID NO:104)
Cella FW1 GGCCTCATCCGCACGAGCGTGTGCGCCGAG (SEQ ID NO:105)
Cella FW2 ACGTCGGGCGGGTCCGGCAACTGCCGCTACGGGGGC (SEQ ID NO:106)
Gelida RV1 GAGCCCGTACACCCGGAGGGCCTCGTTGACGGGCTGGAA (SEQ ID NO:107)
Gelida RV2 CGTCACGCCCTGCGCCTGGTTGCCCGCGAG (SEQ ID NO:108)
Gelida FW1 TCCAGCCCGTCAACGAGGCCCTCCGGGTGTACGGGCTC (SEQ ID NO:109)
Gelida FW2 ACGTCGGTCGCGCAGCCGAACGGTTCGTACGTC (SEQ ID NO:110)
Biazot RV1 CGTGGTCGCGCCGGTCGTGCCGCAGTGCCC (SEQ ID NO:111)
Biazot RV2 GACGACGACCGTGTTGGTAGTGACGTCGACGTACCA (SEQ ID NO:112)
Biazot FW1 TCCACCACGGGGTGGCGCTGCGGGACGATC (SEQ ID NO:113)
Biazot FW2 GTGTGCGCCGAGCCCGGCGACTCCGGCGGC (SEQ ID NO:114)
Turb RV C-mature GCTCGGGCCCCCACCGTCAGAGGTCACGAGCGTGAG (SEQ ID NO:115)
Turb FW signal ATGGCACGATCATTCTGGAGGACGCTCGCCACGGCG (SEQ ID NO:116)
Cellu internal FW TGCTCGATCGGGTACGCCGTCCAGGGCGGCTTC (SEQ ID NO:117)
Cellu internal RV TAGGATCGGGTTGACGGGCTGGTAGAGGAA (SEQ ID NO:118)
Biazot Int Fw TGGTACGTCGACGTCACTACCAACACGGTCGTCGTC (SEQ ID NO:119)
Biazot Int Rv GCCGCCGGAGTCGCCGGGCTCGGCGCACAC (SEQ ID NO:120)
flavi Nterm GTSGACGTSATCGGSGGSAACGCSTACTAC (SEQ ID NO:121)
flavi Cterm SGCSGTSGCSGGNGANGA (SEQ ID NO:122)
fimi Nterm GTSGAYGTSATCGGCGGCGAYGCSTAC (SEQ ID NO:123)
fimi Cterm SGASGCGTANCCCTGNCC (SEQ ID NO:124)
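Several of the primers listed above contain degenerate positions (K, R, S, Y, N), each standing for more than one base. The following is a minimal sketch, assuming the standard IUPAC nucleotide codes, of how such a primer can be expanded into the concrete oligonucleotides it represents. The function names `degeneracy` and `expand` are illustrative only and are not part of the original disclosure.

```python
from itertools import product
from math import prod

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer.upper())


def expand(primer: str) -> list[str]:
    """All concrete sequences represented by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    rv1_cellul = "ACCCACGCGTAGTCGTKGCCGGGG"  # SEQ ID NO:80, one K position
    fimi_cterm = "SGASGCGTANCCCTGNCC"        # SEQ ID NO:124, two S and two N
    print(degeneracy(rv1_cellul), expand(rv1_cellul))
    print(degeneracy(fimi_cterm))
```

Highly degenerate primers such as the Nterm/Cterm pairs represent a pool of oligonucleotides, so enumerating the pool is mainly useful for the low-degeneracy primers.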

The PCR products were subcloned in the pCR4-TOPO TA cloning vector (Invitrogen)
and transformed into E. coli Top10 one-shot electrocompetent cells (Invitrogen). The
transformants were incubated (37°C, 260 rpm, 16 hours) in 2xTY medium with 100 µg/ml
ampicillin. The isolated plasmid DNA (isolated using the Qiagen Qiaprep pDNA isolation kit)
was sequenced by BaseClear.

Sequence Analysis 

Full length polynucleotide sequences were assembled from PCR fragment 
sequences using the ContigExpress and AlignX programs in Vector NTI suite v. 9.0.0
(Invitrogen) using the original polynucleotide sequence obtained in Example 4 as template 
and the ASP mature protease and ASP full-length sequence for alignment. The results for 
the polynucleotide sequences are displayed in Table 7-1 and the translated amino acid 
sequences are displayed in Table 7-2. For each of the natural bacterial strains the 
polynucleotide sequences and translated amino acid sequences for each of the homologous 
proteases are provided above. 
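The relationship between a listed polynucleotide fragment and its translated amino acid sequence can be checked with a short script. The following is a minimal sketch, assuming the standard genetic code and reading frame 1; the function name `translate` is illustrative. The test fragment is the start of the Promicromonospora sukumoe sequence (SEQ ID NO:75), whose translation begins SEQ ID NO:76.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; whitespace is ignored.

    Trailing bases that do not fill a codon are dropped, and codons
    containing ambiguous or unreadable bases are rendered as 'X'.
    """
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3)
    )


if __name__ == "__main__":
    fragment = (
        "TTCCCCGGCA ACGACTACGC GTGGGTGAAC "
        "GTCGGCTCCG ACGACACCCC GATCGGTGCG"
    )
    print(translate(fragment))
```

Mapping unreadable codons to 'X' rather than raising an error is a design choice that suits partially illegible listings.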

Table 7-1 provides comparison information between ASP protease and various other 
sequences obtained from other bacterial strains. Amino acid sequence information for Asp- 
mature-protease homologues is available from 13 species: 

1. Cellulomonas biazotea DSM 20112

2. Cellulomonas flavigena DSM 20109

3. Cellulomonas fimi DSM 20113

4. Cellulomonas cellasea DSM 20118

5. Cellulomonas gelida DSM 20111

6. Cellulomonas iranensis DSM 14784

7. Cellulomonas xylanilytica LMG 21723

8. Oerskovia jenensis DSM 46000

9. Oerskovia turbata DSM 20577

10. Cellulosimicrobium cellulans DSM 20424

11. Promicromonospora citrea DSM 43110

12. Promicromonospora sukumoe DSM 44121

13. Xylanibacterium ulmi LMG 21721

Notably, the sequence from Cellulomonas gelida at 48 amino acids is too short for
useful consensus alignment. Sequence alignments against Asp-mature for the remaining 12
species are provided herein. To date, complete mature sequence has been determined for
Oerskovia turbata, Cellulomonas cellasea, Cellulomonas biazotea and Cellulosimicrobium
cellulans. However, there are some problems, and sequence fidelity is not guaranteed for
the sequence information known to the public. Cellulomonas cellasea protease is clearly
homologous to Asp (61.4% identity). However, the sequencing of 10 independent PCR
fragments of the C-terminal region all gave a stop codon at position 184, suggesting that
there is no C-terminal prosequence. In addition, Cellulosimicrobium cellulans is a close
relative of Cellulomonas and clearly has an Asp-homologous protease. However, the
sequence identity is low, only 47.7%. It contains an insertion of 4 amino acids at position
43-44, and it is uncertain where the N-terminus of the protein begins. Nonetheless, the data
provided here clearly show that there are enzymes homologous to the ASP protease
described herein. Thus, it is intended that the present invention encompass the ASP
protease isolated from Cellulomonas strain 69B4, as well as other homologous proteases.

In this Table, the nucleotide numbering is based on the full-length gene of 69B4
protease (SEQ ID NO:2), where nt 1 - 84 encode the signal peptide, nt 85 - 594 encode the
N-terminal prosequence, nt 595 - 1161 encode the mature 69B4 protease, and nt 1162 -
1485 encode the C-terminal prosequence.
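The nucleotide ranges above map directly onto amino acid positions in the full-length protein. The following is a small sketch of that arithmetic; the dictionary `REGIONS` and the helper `region_to_residues` are illustrative names, and the only inputs are the nucleotide boundaries stated in the text.

```python
# Nucleotide boundaries (1-based, inclusive) as stated for SEQ ID NO:2.
REGIONS = {
    "signal peptide": (1, 84),
    "N-terminal prosequence": (85, 594),
    "mature protease": (595, 1161),
    "C-terminal prosequence": (1162, 1485),
}


def region_to_residues(start_nt: int, end_nt: int) -> tuple[int, int, int]:
    """Convert an in-frame nucleotide range to (first residue, last residue, length)."""
    length_nt = end_nt - start_nt + 1
    if (start_nt - 1) % 3 or length_nt % 3:
        raise ValueError("range is not aligned to whole codons")
    first = (start_nt - 1) // 3 + 1
    last = end_nt // 3
    return first, last, length_nt // 3


if __name__ == "__main__":
    for name, (start, end) in REGIONS.items():
        first, last, n = region_to_residues(start, end)
        print(f"{name}: residues {first}-{last} ({n} amino acids)")
```

Under these boundaries the mature protease corresponds to residues 199-387 (189 amino acids) and the C-terminal prosequence to residues 388-495 (108 amino acids), in agreement with the positions used in Table 7-2.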



Table 7-1. Percent Identity of Homologous Polynucleotide Sequences from
Natural Isolate Strains Compared with ASP Mature Protease Gene Sequence

Strain                                    Total Base Pairs  Overlap*    % Identity (Overlap, Mature Protease)

69B4 (ASP) Protease                       1485              1-1485
Cellulomonas flavigena DSM 20109          555               595-1156    72.3
Cellulomonas biazotea DSM 20112           627               332-1355    73.7
Cellulomonas fimi DSM 20113               474               595-1068    78.7
Cellulomonas gelida DSM 20118             462               1018-1485   72.2
Cellulomonas iranensis DSM 14784          257               748-1004    75.2
Cellulomonas cellasea DSM 20118           904               294-1201    72.7
Cellulomonas xylanilytica LMG 21723       429               640-1068    75.1
Oerskovia turbata DSM 20577               1284              1-1291      73.1
Oerskovia jenensis DSM 46000              387               638-1158    72.7
Cellulosimicrobium cellulans DSM 20424    984               251-1199    63.1
Promicromonospora citrea DSM 43110        257               748-1004    75.9
Promicromonospora sukumoe DSM 44121       257               748-1004    77.4
Xylanibacterium ulmi LMG 21721            430               638-1068    77.0


The following Table (Table 7-2) provides information regarding the translated amino 
acid sequence data in natural isolate strains compared with full-length ASP. 



Table 7-2. Translated Amino Acid Sequence Data Comparisons

Strain                                  Total  Signal peptide       N-terminal pro          Mature protease                    C-terminal pro
                                        amino  overlap: position    overlap: position       overlap: position                  overlap: position
                                        acids

69B4 (ASP) Protease                     495    28 (1-28)            170 (29-198)            189 (199-387)                      108 (388-495)
Cellulomonas flavigena DSM 20109        185    -                    -                       185 (199-383), id 68.6%            -
Cellulomonas biazotea DSM 20112         335    -                    84 (104-198), id 35.8%  189 (199-387), id 70.4%, complete  62 (388-451), id 64.1%
Cellulomonas fimi DSM 20113             144    -                    -                       144 (199-342), id 74.3%            -
Cellulomonas gelida DSM 20118           154    -                    -                       id 68.8%                           106 (388-495), id 63.9%, complete
Cellulomonas iranensis DSM 14784        85     -                    -                       85 (250-334), id 65.9%             -
Cellulomonas cellasea DSM 20118         301    -                    98 (99-198), id 31.0%   189 (199-387), id 68.3%, complete  13 (388-400), id 30.8%
Cellulomonas xylanilytica LMG 21723     143    -                    -                       143 (214-356), id 73.4%            -
Oerskovia turbata DSM 20577             428    29 (2-30), id 43.3%  171 (31-198), id 44.4%  188 (201-389), id 73.0%, complete  40 (390-429), id 10.0%
Oerskovia jenensis DSM 46000            174    -                    -                       174 (214-334), id 73.6%            -
Cellulosimicrobium cellulans DSM 20424  328    -                    117 (82-198), id 6%     199 (199-387), id 47.7%, complete  12 (388-399)
Promicromonospora citrea DSM 43110      85     -                    -                       85 (250-334), id 75.3%             -
Promicromonospora sukumoe DSM 44121     85     -                    -                       85 (250-334), id 64.7%             -
Xylanibacterium ulmi LMG 21721          141    -                    -                       141 (214-354), id 72.3%            -
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These results clearly show that bacterial strains of the suborder Micrococcineae, 
including the families Cellulomonadaceae and Promicromonosporaceae, possess genes that
are homologous with the 69B4 protease. Over the region of the mature 69B4 protease, the
gene sequence identities range from about 60%-80%. The amino acid sequences of these 
homologous sequences exhibit about 45%-80% identity with the mature 69B4 protease 
protein. In contrast to the majority of streptogrisin proteases derived from members of the 
suborder Streptomycineae, these 69B4 (Asp) protease homologues from the suborder 
Micrococcineae possess six cysteine residues, which form three disulfide bridges in the 
mature 69B4 protease protein. 
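Percent identity figures such as those quoted above are computed over the aligned overlap of two sequences. The source does not state the exact formula applied by the alignment software, so the following is only a minimal sketch of one common definition (identical positions divided by compared positions, skipping gap columns). The function name `percent_identity` is illustrative, and the example strings are the first 20 residues of the Promicromonospora citrea and P. sukumoe fragments (SEQ ID NO:74 and NO:76), which need no gaps.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over two pre-aligned, equal-length sequences.

    Columns in which either sequence carries a gap are not counted.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        matches += a == b
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


if __name__ == "__main__":
    citrea = "FPGNDYAWVNTGTDDTLVGA"
    sukumoe = "FPGNDYAWVNVGSDDTPIGA"
    print(f"{percent_identity(citrea, sukumoe):.1f}% identity over {len(citrea)} positions")
```

Other conventions (for example, counting gap columns in the denominator) give somewhat different values, so figures computed this way are not guaranteed to reproduce the tabulated percentages.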

Indeed, in spite of the incomplete sequences provided herein and questions
regarding fidelity, the present invention provides essential elements of the Asp group of
proteases and comparisons with streptogrisins. Asp is uniquely characterized, along
with Streptogrisin C, as having 3 disulfide bridges. In the following sequence, the Asp
amino acids are printed in bold and the fully conserved residues are underlined. The active
site residues are marked with # and double underlined. The cysteine residues are marked
with * and underlined. The disulfide bonds are located between C17 and C38, C95 and
C105, and C131 and C158.

1 5 8 17 20 25 30 32 

XDV[I,V)GG[N, D] fXol C* S fl. VI G fF. Y1 A V X G G F ft. VI TAG H* 

33 35 40 45 50 55 60 

C'G [Xa] G [XJ T/V [XJ GTF XGSS FPG N D*YA [F, W] V [XJ 

65 72 75 80 

[G, D] [XJ [L, P] [Xd VN [N, R] [Y, H] [S, D] £ [G, S] [R, T] V X V [A, T] G 

85 90 95 100 105 

[H, S] [T, QJXAXVG [S, A] X V C* R S G [S, A] TT [G, A] W [H, R] C'G 

112 115 120 125 

P\ Y] [I, V] pCa] [N, G] X [S, T] V X Y [P, A] [E, Q] G [T, S, D] V [R, S] GL 

130 131 135 137 140 

[I, V] R [T, G] [T, N, S] [V, A] CAE [P, G] GDS'GGS [L, V] [L, V, I] [A. S] 

145 150 155 158 

G [N, T] OA [Q, R] G [V, L]IS G [G, R] [S, I] [G, N] [N, D) £ [X*] G 

162 167 169 189 

G PC] Q P [X 21 ] (SEQ ID NO:125) 
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Table 7-3 (below) indicates the positions where ASP and Streptogrisin C differ: 



Table 7-3. Positions At Which ASP and Streptogrisin C Differ

ASP Position  ASP Amino Acid    ASP Homologs    Streptogrisin C Amino Acid

22            A                 R?              S
25            G                 G               N
28            I                 V               A
51            S                 N?              T
55            N                 H?              R
57            Y                 Y               I
65            G                 D               N
74            N                 R               G
76            S                 D               G
77            G                 G               R
79            R                 T               D
88            A                 A               S
122           V                 V               I
125           L                 L               V
126           I                 V               T
141           L                 V               Y
145           N                 T               S



EXAMPLE 8 

Mass Spectrometric Sequencing of ASP Homologues 



In this Example, experiments conducted to confirm the DNA-derived sequence as 
well as verify/establish the N-terminal and C-terminal sequences of the mature ASP 
homologues are described. The microorganisms utilized in these experiments were the 
following: 

1. Cellulomonas biazotea DSM 20112

2. Cellulomonas flavigena DSM 20109

3. Cellulomonas fimi DSM 20113

4. Cellulomonas cellasea DSM 20118

7. Oerskovia jenensis DSM 46000 

8. Oerskovia turbata DSM 20577 

9. Cellulosimicrobium cellulans DSM 20424 

The micropurified ASP homologues were subjected to mass spectrometry-based 
protein sequencing procedures which consisted of these major steps: micropurification, gel 
electrophoresis, in-gel proteolytic digestion, capillary liquid chromatography electrospray 
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tandem mass spectrometry (nanoLC-ESI-MS/MS), database searching of the mass 
spectrometric data, and de novo sequencing. Details of these steps are described in what
follows. As described previously in Example 6, concentrated culture sample (about 200 ml)
was added to 500 ml 1 M CaCl2 and centrifuged at 14,000 rpm (model 5415C Eppendorf) for
5 min. The supernatant was cooled on ice and acidified with 200 ml 1 N HCl. After 5 min,
200 ml 50% trichloroacetic acid were added and the sample was centrifuged for 4 min at
14,000 rpm (model 5415C Eppendorf). The supernatant was discarded and the pellet was
washed first with water and then with 90% acetone. The pellet, after being dried in the 
speed vac, was dissolved in 2X Protein Preparation (Tris-Glycine Sample Buffer; Novex) 
buffer and diluted 1 + 1 with water before being applied to the SDS-PAGE gel. SDS-PAGE 
was run with NuPAGE MES SDS Running Buffer. SDS-PAGE gel (1 mm NuPAGE 10% 
Bis-Tris; Novex) was developed and stained using standard protocols known in the art. 
Following SDS-PAGE, bands corresponding to ASP homologues were excised and 
processed for mass spectrometric peptide sequencing using standard protocols in the art. 

Peptide mapping and sequencing was performed using capillary liquid
chromatography electrospray tandem mass spectrometry (nanoLC-ESI-MS/MS). This
analysis system consisted of a capillary HPLC system (model CapLC; Waters) and a mass
spectrometer (model Qtof Ultima API; Waters). Peptides were loaded on a pre-column
(PepMap100 C18, 5 µm, 100 Å, 300 µm ID x 1 mm; Dionex) and chromatographed on capillary
columns (Biobasic C18 75 µm x 10 cm; New Objectives) using a gradient from 0 to 100%
solvent B in 45 min at a flow rate of 200 nL/min (generated using a static split from a pump
flow rate of 5 µL/min). Solvent A consisted of 0.1% formic acid in water, and solvent B was
0.1% formic acid in acetonitrile. The mass spectrometer was operated with the following
parameters: spray voltage of 3.1 kV, desolvation zone at 150°C, mass spectra acquired
from 400 to 1900 m/z, resolution of 6000 in v-mode. Tandem MS spectra were acquired in
data dependent mode with the two most intense peaks selected and fragmented with mass
dependent collision energy (as specified by vendor) and collision gas (argon) at 2.5 x 10^-5
torr.

The identities of the peptides were determined using a database search program 
(Mascot, Matrix Science) using a database containing ASP homologue DNA-obtained 
sequences. Database searches were performed with the following parameters: no enzyme 
selected, peptide error of 2.5 Da, MS/MS ions error of 0.1 Da, and variable modification of
carboxyaminomethyl cysteine. For unmatched MS/MS spectra, manual de novo sequence
assignments were performed. For example, Figure 4 shows the sequence of the N-terminal-most
tryptic peptide from C. flavigena determined from this tandem mass spectrum. In
Table 8-1, the percentage of the sequence verified on the protein level for various
homologues is reported along with N-terminal and C-terminal peptide sequences.



Table 8-1. Mass Spec. Sequencing of ASP Homologues

ASP Homologue            Sequence Verified %      N-terminal and C-terminal Sequences
                         (Trypsin, Chymotrypsin   (Peptide Mass in Da)
                         Digests)

Cellulomonas cellasea    81, 81                   [IY]AWDAFAENWDWSSR (SEQ ID NO:126) (2026.7)
                                                  YGGTTYFQPVNEILQAY (SEQ ID NO:127) (1961.8)
Cellulomonas flavigena   70, 50                   VDVI/LGGNAYYI/L[...]R (SEQ ID NO:128) (1697.7)
Cellulomonas fimi        21, ND                   VDVI/LGGDAY[...]R (SEQ ID NO:129) (1697.6)

Notes:

ND: not determined
sequence not determined indicated by [...]
sequence order not determined indicated by [ ]
isobaric residues not distinguished indicated by I/L
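The notes to Table 8-1 flag isoleucine and leucine as isobaric, meaning mass spectrometry alone cannot tell them apart. The following is a minimal sketch of a monoisotopic peptide mass calculation that illustrates this point. It assumes standard monoisotopic residue masses for unmodified amino acids and a neutral peptide. The function name `peptide_mass` is illustrative, and the sketch is not intended to reproduce the masses reported in the table, whose calculation basis is not stated in the source.

```python
# Standard monoisotopic residue masses (Da) for unmodified amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added for the free N- and C-termini


def peptide_mass(sequence: str) -> float:
    """Monoisotopic mass of a neutral, unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER


if __name__ == "__main__":
    with_ile = "VDVIGGNAYY"
    with_leu = "VDVLGGNAYY"
    # Both variants have the same mass, which is why the table writes I/L.
    print(peptide_mass(with_ile), peptide_mass(with_leu))
```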



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EXAMPLE 9 
Protease Production in Streptomyces lividans
This Example describes experiments conducted to develop methods for production
of protease by S. lividans. Thus, a plasmid comprising a polynucleotide encoding a
polypeptide having proteolytic activity was constructed, and this vector was used to transform
Streptomyces lividans host cells. The methods used for this transformation are more fully
described in US Patent No. 6,287,839 and WO 02/50245, both of which are herein
expressly incorporated by reference.

One plasmid developed during these experiments was designated as "pSEG69B4T."
The construction of this plasmid made use of one pSEGCT plasmid vector (See, WO
02/50245). A glucose isomerase ("GI") promoter operably linked to the structural gene
encoding the 69B4 protease was used to drive the expression of the protease. A fusion
between the GI-promoter and the 69B4 signal-sequence, N-terminal prosequence and
mature sequence was constructed by fusion-PCR techniques as a XbaI-BamHI fragment.
The fragment was ligated into plasmid pSEGCT digested with XbaI and BamHI, resulting in
plasmid pSEG69B4T (See, Figure 6). Although the present Specification provides specific
expression vectors, it is contemplated that additional vectors utilizing different promoters 
and/or signal sequences combined with various prosequences of the 69B4 protease will find 
use in the present invention. 

An additional plasmid developed during the experiments was designated as
"pSEA469B4CT" (See, Figure 7). As with the pSEG69B4T plasmid, one pSEGCT plasmid
vector was used to construct this plasmid. To create the pSEA469B4CT, the Aspergillus
niger (regulatory sequence) ("A4") promoter was operably linked to the structural gene
encoding the 69B4 protease, and used to drive the expression of the protease. A fusion
between the A4-promoter and the Cel A (from Streptomyces coelicolor) signal-sequence,
the asp-N-terminal prosequence and the asp mature sequence was constructed by
fusion-PCR techniques, as a XbaI-BamHI fragment. The fragment was ligated into plasmid
pSEMGCT digested with XbaI and BamHI, resulting in plasmid pSEA469B4CT (See,
Figure 7). The sequence of the A4 (A. niger) promoter region is:



  1 TCGAA CTTCAT GTTCGA GTTCTT GTTCAC GTAGAA GCCGGA GATGTG AGAGGT
    AGCTT GAAGTA CAAGCT CAAGAA CAAGTG CATCTT CGGCCT CTACAC TCTCCA
 61 GATCTG GAACTG CTCACC CTCGTT GGTGGT GACCTG GAGGTA AAGCAA GTGACC CTTCTG
    CTAGAC CTTGAC GAGTGG GAGCAA CCACCA CTGGAC CTCCAT TTCGTT CACTGG GAAGAC
121 GCGGAG GTGGTA AGGAAC GGGGTT CCACGG GGAGAG AGAGAT GGCCTT GACGGT CTTGGG
    CGCCTC CACCAT TCCTTG CCCCAA GGTGCC CCTCTC TCTCTA CCGGAA CTGCCA GAACCC
181 AAGGGG AGCTTC NGCGCG GGGGAG GATGGT CTTGAG AGAGGG GGAGCT AGTAAT GTCGTA
    TTCCCC TCGAAG NCGCGC CCCCTC CTACCA GAACTC TCTCCC CCTCGA TCATTA CAGCAT
241 CTTGGA CAGGGA GTGCTC CTTCTC CGACGC ATCAGC CACCTC AGCGGA GATGGC ATCGTG
    GAACCT GTCCCT CACGAG GAAGAG GCTGCG TAGTCG GTGGAG TCGCCT CTACCG TAGCAC
301 CAGAGA CAGACC
    GTCTCT GTCTGG

(SEQ ID NO: 130)

In these experiments, the host Streptomyces lividans TK23 was transformed with
either of the vectors described above using protoplast methods known in the art (See e.g.,
Hopwood et al., Genetic Manipulation of Streptomyces: A Laboratory Manual, The John
Innes Foundation, Norwich, United Kingdom [1985]).

The transformed culture was expanded to provide two fermentation cultures. At
various time points, samples of the fermentation broths were removed for analysis. For the
purposes of this experiment, a skim milk procedure was used to confirm successful
cloning. In these methods, 30 µl of the shake flask supernatant was spotted in punched out
holes in skim milk agar plates and incubated at 37°C. The incubated plates were visually
reviewed after overnight incubation for the presence of halos. For purposes of this
experiment, the same samples were also assayed for protease activity and for molecular
weight (SDS-PAGE). At the end of the fermentation run, full length protease was observed
by SDS-PAGE.

A sample of the fermentation broth was assayed as follows: 10 µl of the diluted
supernatant was taken and added to 190 µl AAPF substrate solution (conc. 1 mg/ml, in 0.1
M Tris/0.005% TWEEN, pH 8.6). The rate of increase in absorbance at 410 nm due to
release of p-nitroaniline was monitored (25°C). The assay results of the fermentation broth
of 3 clones (X, Y, W) obtained using the pSEG69B4T and two clones using the
pSEA469B4T indicated that Asp was expressed by both constructs (See, Table XXI, Results
for Two Clones [pSEA469B4T]). Indeed, the results obtained in these experiments showed that
the polynucleotide encoding a polypeptide having proteolytic activity was expressed in
Streptomyces lividans, using both of these expression vectors. Although two vectors are
described in this Example, it is contemplated that additional expression vectors using
different promoters and/or signal sequences combined with different combinations of 69B4
protease (i.e., with or without the N-terminal and C-terminal prosequences) in the pSEA4CT
backbone (vector), as well as other constructs, will find use in the present invention.
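The AAPF readout described above is a rate of change in absorbance at 410 nm. As a minimal sketch of how such a rate might be computed from timed plate-reader readings, the following Python fragment fits a least-squares slope and applies a blank and dilution correction. The function names, the example readings, the blank rate and the dilution factor are all hypothetical illustrations and are not taken from the Specification.

```python
def absorbance_rate(times_s, absorbances):
    """Least-squares slope of A410 versus time (absorbance units per second)."""
    n = len(times_s)
    if n < 2 or n != len(absorbances):
        raise ValueError("need at least two paired time/absorbance readings")
    mean_t = sum(times_s) / n
    mean_a = sum(absorbances) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_s, absorbances))
    return sxy / sxx


def corrected_rate(sample_rate, blank_rate, dilution_factor):
    """Subtract the substrate-only blank rate and scale by the sample dilution."""
    return (sample_rate - blank_rate) * dilution_factor


if __name__ == "__main__":
    # Hypothetical readings: 10 ul diluted supernatant + 190 ul AAPF solution.
    times = [0, 30, 60, 90]            # seconds
    a410 = [0.10, 0.16, 0.22, 0.28]    # absorbance at 410 nm
    rate = absorbance_rate(times, a410)
    print("raw rate (A410/s):", rate)
    print("corrected rate:", corrected_rate(rate, blank_rate=0.0002, dilution_factor=10))
```

Fitting a slope over several readings, rather than differencing two points, is one common way to reduce the effect of noise in a kinetic assay; other approaches are equally possible.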

EXAMPLE 10

Protease Production in B. subtilis

In this Example, experiments conducted to produce protease 69B4 (also referred to
herein as "ASP," "Asp," "ASP protease," and "Asp protease") in B. subtilis are
described. In this Example, the transformation of plasmid pHPLT-ASP-C1-2 (See, Table
10-1; and Figure 9) into B. subtilis is described. Transformation was performed as known
in the art (See e.g., WO 02/14490, incorporated herein by reference). To optimize ASP
expression in B. subtilis, a synthetic DNA sequence was produced by DNA2.0 and utilized in
these expression experiments. The DNA sequence (synthetic ASP DNA sequence)
provided below, with codon usage adapted for Bacillus species, encodes the wild type ASP
precursor protein:



ATGACACCACGAACTGTCACAAGAGCTCTGGCTGTGGCAACAGCAGCTGCTACACTCTTGGCTGGGGGTAT
GGnAGCACAAGCTAACGAACCGGCTCCTCCAGGATCTGCATCAGCCCCTCCACGATTAGCTGAAAAACTTGA
CCCTGACTTACTTGAAGCAATGGAACGCGATCTGGGGTTAGATGCAGAGGAAGCAGCTGCMCGTTAGCTTT
TCAGCATGACGCAGCTGAAACGGGAGAGGCTCTTGCTGAGGAACTCGACGAAGATTTCGCGGGCACGTGGG
TTGAAGATGATGTGCTGTATGTTGCAACCACTGATGAAGATGCTGTTGAAGAAGTCGAAGGCGAAGGAGCAA
CTGCTGTGACTGTTGAGCATTCTCTTGCTGATTTAGAGGCGTGGAAGACGGTTTTGGATGCTGCGCTGG
GTCATGATGATGTGCCTACGTGGTACGTCGACGTGCCTACGAATTCGGTAGTCGTTGCTGTAAAGGCAGGAG
CGCAGGATGTAGCTGCAGGACTTGTGGAAGGCGCTGATGTGCCATCAGATGCGGTCACTTTTGTAGAAACG
GACGAAACGCCTAGMCGATGTTCGACGTAATTGGAGGCAACGCATATACTATTGGCGGCCGGTCTAGATG
TTCTATCGGATTCGCAGTAAACGGTGGCTTCATTACTGCCGGTCACTGCGGAAGAACAGGAGCCACTACTG
CCAATCCGACTGGCACATTTGCAGGTAGCTCGTTTCCGGGAAATGATTATGCATTCGTCCGAACAGGGGCA
GGAGTAAATTTGCTTGCCCAAGTCAATAACTACTCGGGCGGCAGAGTCCAAGTAGCAGGACATACGGCCG
CACCAGTTGGATCTGCTGTATGCCGCTCAGGTAGCACTACAGGTTGGCATTGCGGAACTATCACGGCGCT
GAATTCGTCTGTCACGTATCCAGAGGGAACAGTCCGAGGACTTATCCGCACGACGGTTTGTGCCGAACCA
GGTGATAGCGGAGGTAGCCTTTTAGCGGGAAATCAAGCCCAAGGTGTCACGTCAGGTGGTTCTGGAAATT
GTCGGACGGGGGGAACAACATTCTTTCAACCAGTCAACCCGATTTTGCAGGCTTACGGCCTGAGAATGATT
ACGACTGACTCTGGAAGTTCCCCTGCTCCAGCACCTACATCATGTACAGGCTACGCAAGAACGTTCACAGG
AACCCTCGCAGCAGGAAGAGCAGCAGCTCAACCGAACGGTAGCTATGTTCAGGTCAACCGGAGCGGTACAC
ATTCCGTCTGTCTCAATGGACCTAGCGGTGCGGACTTTGATTTGTATGTGCAGCGATGGAATGGCAGTAGCT
GGGTAACCGTCGCTCAATCGACATCGCCGGGAAGCAATGAAACCATTACGTACCGCGGAAATGCTGGATATT
ATCGCTACGTGGTTAACGCTGCGTCAGGATCAGGAGCTTACACAATGGGACTCACCCTCCCCTGA
(SEQ ID NO:131)

In the above sequence, bold indicates the DNA that encodes the mature protease,
standard font indicates the leader sequence, and the underline indicates the N-terminal and
C-terminal prosequences.

Expression of the Synthetic ASP Gene

Asp expression cassettes were constructed in the pXX-KpnI (See, Figure 15) or
p2JM103-DNNDPI (See, Figure 16) vectors and subsequently cloned into the pHPLT vector
(See, Figure 17) for expression of ASP in B. subtilis. pXX-KpnI is a pUC based vector with
the aprE promoter (B. subtilis) driving expression, a cat gene, and a duplicate aprE promoter
for amplification of the copy number in B. subtilis. The bla gene allows selective growth in
E. coli. The KpnI site, introduced in the ribosomal binding site downstream of the aprE
promoter region, together with the HindIII site enables cloning of Asp expression cassettes
in pXX-KpnI. The vector p2JM103-DNNDPI contains the aprE promoter (B. subtilis) to drive
expression of the BCE103 cellulase core (endo-cellulase from an obligatory alkaliphilic
Bacillus; See, Shaw et al., J. Mol. Biol., 320:303-309 [2002]), in frame with an acid labile
linker (DDNDPI [SEQ ID NO: 132]; See, Segalas et al., FEBS Lett., 371:171-175 [1995]).
The ASP expression cassette (BamHI and HindIII) was fused to the BCE103-DDNDPI fusion
protein. When secreted, ASP is cleaved off the cellulase core to turn into the mature
protease.

pHPLT (See, Figure 17; and Solingen et al., Extremophiles 5:333-341 [2001])
contains the thermostable amylase LAT promoter (PLAT) of Bacillus licheniformis, followed by
XbaI and HpaI restriction sites for cloning ASP expression constructs. The following
sequence is that of the BCE103 cellulase core with DNNDPI acid labile linker. In this
sequence, the bold indicates the acid-labile linker, while the standard font indicates the
BCE103 core.



    VRSKKLWISLLFALTLIFTM
  1 GTGAGA AGCAAA AAATTG TGGATC AGCTTG TTGTTT GCGTTA ACGTTA ATCTTT ACGATG
    CACTCT TCGTTT TTTAAC ACCTAG TCGAAC AACAAA CGCAAT TGCAAT TAGAAA TGCTAC
    AFSNMSAQADDYSVVEEHGQ
 61 GCGTTC AGCAAC ATGAGC GCGCAG GCTGAT GATTAT TCAGTT GTAGAG GAACAT GGGCAA
    CGCAAG TCGTTG TACTCG CGCGTC CGACTA CTAATA AGTCAA CATCTC CTTGTA CCCGTT
    LSISNGELVNERGEQVQLKG
121 CTAAGT ATTAGT AACGGT GAATTA GTCAAT GAACGA GGCGAA CAAGTT CAGTTA AAAGGG
    GATTCA TAATCA TTGCCA CTTAAT CAGTTA CTTGCT CCGCTT GTTCAA GTCAAT TTTCCC
    MSSHGLQWYGQFVNYESMKW
181 ATGAGT TCCCAT GGTTTG CAATGG TACGGT CAATTT GTAAAC TATGAA AGCATG AAATGG
    TACTCA AGGGTA CCAAAC GTTACC ATGCCA GTTAAA CATTTG ATACTT TCGTAC TTTACC
    LRDDWGITVFRAAMYTSSGG
241 CTAAGA GATGAT TGGGGA ATAACT GTATTC CGAGCA GCAATG TATACC TCTTCA GGAGGA
    GATTCT CTACTA ACCCCT TATTGA CATAAG GCTCGT CGTTAC ATATGG AGAAGT CGTCCT
    YIDDPSVKEKVKETVEAAID
301 TATATT GACGAT CCATCA GTAAAG GAAAAA GTAAAA GAGACT GTTGAG GCTGCG ATAGAC
    ATATAA CTGCTA GGTAGT CATTTC CTTTTT CATTTT CTCTGA CAACTC CGACGC TATCTG
    LGIYVIIDWHILSDNDPNIY
361 CTTGGC ATATAT GTGATC ATTGAT TGGCAT ATCCTT TCAGAC AATGAC CCGAAT ATATAT
    GAACCG TATATA CACTAG TAACTA ACCGTA TAGGAA AGTCTG TTACTG GGCTTA TATATA
    KEEAKDFFDEMSELYGDYPN
421 AAAGAA GAAGCG AAGGAT TTCTTT GATGAA ATGTCA GAGTTG TATGGA GACTAT CCGAAT
    TTTCTT CTTCGC TTCCTA AAGAAA CTACTT TACAGT CTCAAC ATACCT CTGATA GGCTTA
    VIYEIANEPNGSDVTWDNQI
481 GTGATA TACGAA ATTGCA AATGAA CCGAAT GGTAGT GATGTT ACGTGG GACAAT CAAATA
    CACTAT ATGCTT TAACGT TTACTT GGCTTA CCATCA CTACAA TGCACC CTGTTA GTTTAT
    KPYAEEVIPVIRDNDPNNIV
541 AAACCG TATGCA GAAGAA GTGATT CCGGTT ATTCGT GACAAT GACCCT AATAAC ATTGTT
    TTTGGC ATACGT CTTCTT CACTAA GGCCAA TAAGCA CTGTTA CTGGGA TTATTG TAACAA
    IVGTGTWSQDVHHAADNQLA
601 ATTGTA GGTACA GGTACA TGGAGT CAGGAT GTCCAT CATGCA GCCGAT AATCAG CTTGCA
    TAACAT CCATGT CCATGT ACCTCA GTCCTA CAGGTA GTACGT CGGCTA TTAGTC GAACGT
    DPNVMYAFHFYAGTHGQNLR
661 GATCCT AACGTC ATGTAT GCATTT CATTTT TATGCA GGAACA CATGGA CAAAAT TTACGA
    CTAGGA TTGCAG TACATA CGTAAA GTAAAA ATACGT CCTTGT GTACCT GTTTTA AATGCT
    DQVDYALDQGAAIFVSEWGT
721 GACCAA GTAGAT TATGCA TTAGAT CAAGGA GCAGCG ATATTT GTTAGT GAATGG GGGACA
    CTGGTT CATCTA ATACGT AATCTA GTTCCT CGTCGC TATAAA CAATCA CTTACC CCCTGT
    SAATGDGGVFLDEAQVWIDF
781 AGTGCA GCTACA GGTGAT GGTGGT GTGTTT TTAGAT GAAGCA CAAGTG TGGATT GACTTT

purified primers gave far better results in terms of incorporation of full length primers as well
as significant reduction in primer-containing errors. However, in these experiments, purified
primers were not used, which probably resulted in 12% of the clones having undesired
mutations.



Table 16-1. Primers and Sequences

Primer name   Primer sequence
ASPR14L       gcatatactattggcggcctgtctagatgttctatcgga (SEQ ID NO:595)
ASPR16Q       actattggcggccggtctcagtgttctatcggattcgc (SEQ ID NO:596)
ASPR35F       ctgccggtcactgcggatttacaggagccactactgc (SEQ ID NO:597)
ASPR61S       atgattatgcattcgtctcaacaggggcaggagtaaat (SEQ ID NO:598)
ASPR79T       ataactactcgggcggcacagtccaagtagcaggacatac (SEQ ID NO:599)
ASPR123L      atccagagggaacagtcctgggacttatccgcacgac (SEQ ID NO:600)
ASPR127Q      cagtccgaggacttatccagacgacggtttgtgccgaac (SEQ ID NO:601)
ASPR159Q      gtggttctggaaattgtcagacggggggaacaacattc (SEQ ID NO:602)
ASPR179Q      tgcaggcttacggcctgcagatgattacgactgactc (SEQ ID NO:603)

ASPC17S       ttggcggccggtctagatcatctatcggattcgcagta (SEQ ID NO:604)
ASPC33S       tcattactgccggtcactcaggaagaacaggagccact (SEQ ID NO:605)
ASPC95S       cagttggatctgctgtatctcgctcaggtagcactac (SEQ ID NO:606)
ASPC105S      cactacaggttggcattcaggaactatcacggcgctg (SEQ ID NO:607)
ASPC131S      cttatccgcacgacggtttcagccgaaccaggtgatag (SEQ ID NO:608)
ASPC158S      cagatggttctggaaattcacggacggggggaacaac (SEQ ID NO:609)

ASPSEQF1      tgcctcacatttgtgccac (SEQ ID NO:610)
ASPSEQF4      caggatgtagctgcaggac (SEQ ID NO:611)
ASPSEQR4      ctcggttatgagttagttc (SEQ ID NO:612)



pHPLT-ASP-C1-2 Plasmid Preparation and In Vitro Methylation

To construct the cysteine and arginine libraries using the QCMS kit, the template
plasmid pHPLT-ASP-C1-2 was first methylated in vitro, since it was derived from a Bacillus
strain that does not methylate DNA at GATC sites. This method was used because the
more common approach of ensuring methylation of plasmids used in the QCMS protocol,
which involves deriving the DNA from dam+ E. coli strains, was not an option here, as the
plasmid pHPLT-ASP-C1-2 does not grow in E. coli.

Miniprep DNA was prepared from Bacillus cells harboring the pHPLT-ASP-C1-2
plasmid. Specifically, the strain was grown overnight in 5 mL of LB with 10 ppm of neomycin,
after which the cells were spun down. The Qiagen spin miniprep DNA kit was used for
preparing the plasmid DNA with an additional step wherein 100 µL of 10 mg/mL lysozyme
was added after the addition of 250 µL of P1 buffer from the kit. The sample was incubated
at 37°C for 15 min with shaking, after which the remaining steps outlined in the Qiagen
miniprep kit manual were carried out. The miniprep DNA was eluted with 30 µL of Qiagen
buffer EB provided in the kit.

Next, the pHPLT-ASP-C1-2 plasmid DNA was methylated in vitro using a dam
methylase kit from NEB (NEB catalog # M0222S). Briefly, 25 µL of the miniprep DNA (about
1-2 µg) was incubated with 20 µL of the 10x NEB dam methylase buffer, 0.5 µL of
S-adenosylmethionine (80 µM), 4 µL of the dam methylase and 150.5 µL of sterile distilled
water. The reaction was incubated at 37°C for 4 hours, after which the DNA was purified
using a Qiagen PCR purification kit. The methylated DNA was eluted with 40 µL of buffer EB
provided in the kit. To confirm methylation of the DNA, 4 µL of the purified, methylated DNA
was digested with MboI (NEB; this enzyme cuts unmethylated GATC sites) or DpnI (Roche;
this enzyme cuts methylated GATC sites) in a 20 µL reaction using 2 µL of each enzyme.
The reactions were incubated at 37°C for 2 hours and they were analyzed on a 1.2% E-gel
(Invitrogen). A small molecular weight DNA smear/ladder was observed for the DpnI digest,
whereas the MboI digest showed intact DNA, which indicated that the pHPLT-ASP-C1-2
plasmid was successfully methylated.
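The methylation reaction and its diagnostic digest lend themselves to a quick consistency check. The sketch below, written in Python purely for illustration, sums the component volumes stated above, derives the final buffer strength, and encodes the digest logic (DpnI cuts methylated GATC sites, MboI cuts unmethylated ones). The dictionary and function names are hypothetical and are not part of the Specification or of any kit protocol.

```python
# Component volumes (in microliters) as stated for the dam methylation reaction.
DAM_REACTION_UL = {
    "miniprep DNA": 25.0,
    "10x dam methylase buffer": 20.0,
    "S-adenosylmethionine": 0.5,
    "dam methylase": 4.0,
    "sterile distilled water": 150.5,
}


def total_volume(components):
    """Total reaction volume in microliters."""
    return sum(components.values())


def final_buffer_strength(buffer_ul, stock_strength_x, total_ul):
    """Final buffer strength (in 'x') after dilution into the full reaction."""
    return stock_strength_x * buffer_ul / total_ul


def methylation_status(dpni_cuts, mboi_cuts):
    """Interpret the diagnostic digest: DpnI cuts methylated GATC sites,
    MboI cuts unmethylated GATC sites."""
    if dpni_cuts and not mboi_cuts:
        return "methylated"
    if mboi_cuts and not dpni_cuts:
        return "unmethylated"
    return "inconclusive"


if __name__ == "__main__":
    total = total_volume(DAM_REACTION_UL)
    print("total volume (uL):", total)
    print("buffer strength (x):", final_buffer_strength(20.0, 10, total))
    # Result reported in the text: smear for DpnI, intact DNA for MboI.
    print("digest result:", methylation_status(dpni_cuts=True, mboi_cuts=False))
```

With the stated volumes the components add up to a 200 µL reaction, in which 20 µL of 10x buffer corresponds to a 1x final strength.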

Library Construction

The cysteine (cys) and arginine (arg) combinatorial libraries were constructed as
outlined in the Stratagene QCMS kit, with the exception of the primer concentration used in
the reactions. Specifically, 4 µL of the methylated, purified pHPLT-ASP-C1-2 plasmid (about
25 to 50 ng) was mixed with 15 µL of sterile distilled water, 1.5 µL of dNTP, 2.5 µL of 10x
buffer, 1 µL of the enzyme blend and 1.0 µL of arginine or cysteine mutant primer mix (i.e.,
for a total of 100 ng of primers). The primer mix was prepared using 10 µL of each of the nine
arginine primers (100 ng/µL) or each of the six cysteine primers (100 ng/µL); adding 50 ng of
each primer for both the arg and cys libraries as recommended in the Stratagene manual
resulted in less than 50% of the clones containing mutations in a previous round of
mutagenesis. Thus, the protocol was modified in the present round of mutagenesis to
include a total of 100 ng of primers in each reaction. The cycling conditions were 95°C for 1
min, followed by 30 cycles of 95°C for 1 min, 55°C for 1 min, and 65°C for 9 min, in an MJ
Research thermocycler using thin-walled 0.2 mL PCR tubes. The reaction product was
digested with 1 µL of DpnI from the QCMS kit by incubating at 37°C overnight. An additional
0.5 µL of DpnI was added, and the reaction was incubated for 1 hour.

To transform the library DNA directly into Bacillus cells without going through E. coli,
the library DNA (single-stranded QCMS product) was amplified using the TempliPhi kit
(Amersham cat. #25-6400), because Bacillus requires double-stranded multimeric DNA for
transformation. For this purpose, 1 µL of the arginine or cysteine QCMS reaction was mixed
with 5 µL of sample buffer from the TempliPhi kit and heated for 3 minutes at 95°C to
denature the DNA. The reaction was placed on ice to cool for 2 minutes and then spun
down briefly. Next, 5 µL of reaction buffer and 0.2 µL of phi29 polymerase from the
TempliPhi kit were added, and the reactions were incubated at 30°C in an MJ Research
PCR machine for 4 hours. The phi29 enzyme was heat inactivated in the reactions by
incubation at 65°C for 10 min in the PCR machine.
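The primer amounts used in the library construction above follow from simple dilution arithmetic: pooling equal volumes of equally concentrated primers keeps the total concentration at the stock value, so 1 µL of the pool delivers 100 ng of primers in total, split evenly among the primers. The following Python sketch works this through for the nine-primer arginine mix and the six-primer cysteine mix; the function name and return layout are hypothetical.

```python
def primer_mix_amounts(n_primers, ul_each, stock_ng_per_ul, ul_used):
    """Return (total ng, ng per primer) delivered by `ul_used` of an equal-volume pool."""
    pool_volume_ul = n_primers * ul_each
    pool_mass_ng = pool_volume_ul * stock_ng_per_ul
    pool_conc_ng_per_ul = pool_mass_ng / pool_volume_ul  # equals the stock concentration
    total_ng = pool_conc_ng_per_ul * ul_used
    return total_ng, total_ng / n_primers


if __name__ == "__main__":
    for name, count in (("arginine", 9), ("cysteine", 6)):
        total, each = primer_mix_amounts(count, ul_each=10.0, stock_ng_per_ul=100.0, ul_used=1.0)
        print(f"{name}: {total:.0f} ng total, {each:.1f} ng per primer")
```

Under these figures each arginine primer contributes roughly 11 ng and each cysteine primer roughly 17 ng per reaction, well below the 50 ng per primer mentioned for the kit manual.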

For transformation of the libraries into Bacillus, 0.5 µL of the TempliPhi amplification
reaction product was mixed with 100 µL of comK competent cells, followed by vigorous
shaking at 37°C for 1 hour. The transformation was serially diluted up to 10 fold, and 50 µL
of each dilution was plated on LA plates containing 10 ppm neomycin and 1.6% skim milk.
Twenty-four clones from each library were picked for sequencing. Briefly, the colonies were
resuspended in 20 µL of sterile distilled water and 2 µL was then used for PCR with
ReadyTaq beads (Amersham) in a total volume of 25 µL. Primers ASPF1 and ASPR4 were
added at a concentration of 0.5 µM. Cycling conditions were 94°C for 4 min once, followed
by 30 cycles of 94°C for 1 min, 55°C for 1 min, and 72°C for 1 min, followed by one round at
72°C for 7 min. A 1.5 kb fragment was obtained in each case and the product was purified
using a Qiagen PCR purification kit. The purified PCR products were sequenced with
ASPF4 and ASPR4 primers.

A total of 48 clones were sequenced (24 from each library). The mutagenesis
worked quite well, in that only about 15% of the clones were WT. However, 20% of the clones
had mixed sequences, either because the plate was crowded with colonies or because the
TempliPhi amplification resulted in very concentrated DNA for transformation. Also, as
indicated above, about 12% of clones had extra mutations. The remaining clones were all
mutant, and of these about 60-80% were unique mutants. The sequencing results for the
arginine and cysteine libraries are provided below in Tables 16-2 and 16-3.



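Table 16-2. Arginine Library Sequencing and Skim Milk Plate Results

Colony  Halo         R14L  R16Q  R35F  R61S  R79T  R123L  R127Q  R159Q  R179Q
R1      medium       -     X     X     -     -     -      -      X      -
R2      yes          -     -     -     -     -     -      -      X      -
R3      yes          -     X     -     -     -     X      -      -      -
R4      yes          -     X     -     -     -     X      -      -      -
R5      (illegible)  -     X     -     -     -     X      -      -      -
R6      yes          -     X     -     -     -     X      -      -      -
R7      yes          X     -     -     -     -     -      X      X      -
R8      yes          -     X     -     -     -     X      -      -      -
R9      yes          -     -     -     -     -     -      -      -      -
R10     yes          X     -     -     -     -     -      -      -      X
R11     yes          -     -     -     -     -     -      -      -      -
R12     medium       -     X     X     -     -     -      -      X      -
R13     yes          -     -     -     -     X     -      -      -      -
R14     yes          -     -     -     -     -     -      -      -      -
R15     yes          -     -     -     -     -     -      -      -      -
R16     medium       -     -     -     -     -     -      -      -      -
R17     no           -     -     -     X     -     X      X      -      -
R18     medium       -     -     -     -     -     X      X      -      X
R19     medium       -     -     -     -     -     -      -      -      -
R20     yes          X     -     -     -     -     -      X      X      -
R21     medium       -     X     -     -     X     -      X      -      -
R22     small        -     -     -     -     -     -      -      -      -
R23     yes          -     X     -     -     X     -      -      -      -
R24     yes          -     -     -     -     -     -      -      -      -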

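Table 16-3. Cysteine Library Sequencing and Skim Milk Plate Results

Colony  Halo?  C17S  C33S  C95S  C105S  C131S  C158S
C1      no     X     X     -     -      -      -
C2      no     -     -     -     -      -      -
C3      yes    -     -     -     -      -      -
C4      yes    -     -     -     -      -      -
C5      no     X     -     X     -      -      -
C6      small  X     -     -     X      -      -
C7      no     -     -     X     X      X      -
C8      yes    -     -     -     -      -      -
C9      no     -     -     -     -      -      -
C10     no     -     -     -     -      -      -
C11     small  -     -     -     -      -      -
C12     no     -     -     -     -      -      -
C13     no     X     -     X     -      -      -
C14     no     X     X     X     -      -      X
C15     no     -     -     -     -      -      -
C16     no     -     -     -     -      -      X
C17     no     -     -     -     -      -      X
C18     no     X     -     X     X      -      X
C19     yes    -     -     -     -      -      -
C20     no     -     -     -     -      -      -
C21     no     -     -     -     -      -      -
C22     no     -     -     -     X      -      -
C23     no     X     -     X     -      -      -
C24     yes    -     -     -     -      -      -

WO 2005/052146 PCT/US2004/039066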

Of the mutants identified in sequencing, the following mutants from the arginine
library (See, Table 16-4) were found to be of interest. See the Examples below for
additional data regarding the properties of these mutants.



Table 16-4. Arginine Mutants of Interest

MUTANT  SEQUENCE
R1      R16Q R35F R159Q
R2      R159Q
R3      R16Q R123L
R7      R14L R127Q R159Q
R10B    R14L R179Q
R18     R123L R127Q R179Q
R21     R16Q R79T R127Q
R23     R16Q R79T
R10     R14L R79T



Importantly, the activity results indicated that mutations in the cysteine residues
produced ASP proteases with very low or no activity, suggesting that the disulfide bridges
play an important role in the stability of the molecule. However, it is not intended that the
present invention be limited to any particular mechanism(s).

EXAMPLE 17

Expression of Homologous O. turbata Protease in S. lividans

In this Example, expression in S. lividans of a protease produced by O. turbata that is
homologous to the protease 69B4 is described. Thus, this Example describes plasmids
comprising polynucleotides encoding a polypeptide having proteolytic activity and the use of
such vectors to transform a Streptomyces lividans host cell. The transformation methods used
herein are known in the art (See e.g., U.S. Pat. No. 6,287,839; and WO 02/50245, herein
incorporated by reference).

The vector (i.e., plasmid) used in these experiments comprised a polynucleotide
encoding a protease of the present invention obtained from Oerskovia turbata DSM 20577.
This plasmid was used to transform Streptomyces lividans. The final plasmid vector is
referred to herein as "pSEA4CT-O.turbata."

As with previous vectors, the construction of pSEA4CT-O.turbata made use of the
pSEGCT plasmid vector (See, above).

An Aspergillus niger ("A4") regulatory sequence operably linked to the structural
gene encoding the Oerskovia turbata protease (Otp) was used to drive the expression of the
protease. A fusion between the A4-regulatory sequence and the Oerskovia turbata
signal-sequence, N-terminal prosequence and mature protease sequence (i.e., without the
C-terminal prosequence) was constructed by fusion-PCR techniques known in the art, as an
XbaI-BamHI fragment. The polynucleotide primers for the cloning of Oerskovia turbata
protease (Otp) in pSEA4CT were based on SEQ ID NO:67. The primer sequences used
were:

A4-turb Fw
5'-CAGAGACAGACCCCCGGAGGTAACCATGGCACGATCATTCTGGAGGACGC-3' (SEQ ID NO:613)

A4-turb RV
5'-GCGTCCTCCAGAATGATCGTGCCATGGTTACCTCCGGGGGTCTGTCTCTG-3' (SEQ ID NO:614)

A4-turb Bam Rv
5'-ATCCGCTCGCGGATCCCCATTGTCAGCTCGGGCCCCCACCGTCAGAGGTCACGAG-3' (SEQ ID NO:615)

A4-Xba1-FW
5'-GCAGCCTGAACTAGTTGCGATCCTCTAGAGATCGAACTTCAT-3' (SEQ ID NO:616)

The fragment was ligated into plasmid pSEA4CT digested with XbaI and BamHI,
resulting in plasmid pSEA4CT-O.turbata.
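In fusion PCR the two internal primers must anneal to the same junction on opposite strands, which is why A4-turb Fw (SEQ ID NO:613) and A4-turb RV (SEQ ID NO:614) listed above are reverse complements of one another. As an illustrative sketch only, the following Python fragment checks that relationship for the two primer sequences as printed; the helper name is hypothetical and the check is not part of the Specification.

```python
# Watson-Crick complement table for unambiguous DNA bases.
_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


# Internal fusion-PCR primers as printed in the text (5'->3').
A4_TURB_FW = "CAGAGACAGACCCCCGGAGGTAACCATGGCACGATCATTCTGGAGGACGC"
A4_TURB_RV = "GCGTCCTCCAGAATGATCGTGCCATGGTTACCTCCGGGGGTCTGTCTCTG"

if __name__ == "__main__":
    overlap_ok = reverse_complement(A4_TURB_FW) == A4_TURB_RV
    print("primers are reverse complements:", overlap_ok)
```

A check of this kind can also help flag transcription errors in a primer list, since a single wrong base in either primer breaks the equality.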

The host Streptomyces lividans TK23 was transformed with plasmid vector
pSEA4CT-O.turbata using the protoplast method described in the previous Example (i.e.,
using the method of Hopwood et al., supra).

The transformed culture was expanded to provide two fermentation cultures in TS*
medium. The composition of TS* medium was (g/L): tryptone (Difco) 16, soytone (Difco) 4,
casein hydrolysate (Merck) 20, K2HPO4 10, glucose 15, Basildon antifoam 0.6, pH 7.0. At
various time points, samples of the fermentation broths were removed for analysis. For the
purposes of this experiment, a skim milk procedure was used to confirm successful cloning.
30 µL of the shake flask supernatant was pipetted in punched out holes in skim milk agar
plates and incubated at 37°C.

The incubated plates were visually reviewed after overnight incubation for the 
presence of clearing zones (halos) indicating the expression of proteolytic enzyme. For 
purposes of this experiment, the samples were also assayed for protease activity and for 
molecular weight (SDS-PAGE). At the end of the fermentation, full length protease was 
observed by SDS-PAGE. 

A sample of the fermentation broth was assayed as follows: 10 µL of the diluted
supernatant was collected and analyzed using the Dimethylcasein Hydrolysis Assay
described in Example 1. The assay results of the fermentation broth of 2 clones clearly
show that the polynucleotide from Oerskovia turbata encoding a polypeptide having
proteolytic activity was expressed in Streptomyces lividans.



EXAMPLE 18 

Expression of Homologous Cellulomonas and Cellulosimicrobium
Proteases in S. lividans

In this Example, expression in S. lividans of proteases produced by Cellulomonas cellasea
DSM 20118 and Cellulosimicrobium cellulans DSM 204244 that are homologous to the protease
69B4 is described. Thus, this Example describes plasmids comprising
polynucleotides encoding a polypeptide having proteolytic activity and the use of such vectors
to transform a Streptomyces lividans host cell. The transformation methods used herein are
known in the art (See e.g., U.S. Pat. No. 6,287,839; and WO 02/50245, herein incorporated
by reference).

The final plasmid vectors are referred to as pSEA4CT-C.cellasea and pSEA4CT- 
Cm.cellulans. The construction of pSEA4CT-C.cellasea and pSEA4CT-Cm.cellulans made 
use of the pSEGCT plasmid vector described above. 

An Aspergillus niger ("A4") regulatory sequence operably linked to the structural
gene encoding the Cellulomonas cellasea mature protease (Ccp) or alternatively, the
structural gene encoding the Cellulosimicrobium cellulans mature protease (Cmcp) was
used to drive the expression of the protease. A fusion between the A4-regulatory sequence
and the 69B4 protease signal-sequence, N-terminal prosequence of the 69B4 protease
gene and mature sequence of the native protease gene obtained from genomic DNA of a
strain of Micrococcineae (herein, Cellulomonas cellasea or Cellulosimicrobium cellulans)
was constructed by fusion-PCR techniques, as an XbaI-BamHI fragment. The polynucleotide
primers for the cloning of Cellulomonas cellasea protease (Ccp) in pSEA4CT were based on
SEQ ID NO:63, and are as follows:

Asp-npro fw-cell
5'-AGACCGACGAGACCCCGCGGACCATGGTCGACGTCATCGGCGGCAACGCGTACTAC-3'
(SEQ ID NO:617)

Cell-BH1-rv
5'-TCAGCCGATCCGCTCGCGGATCCCCATTGTCAGCCCAGGACGAGACGCAGACCGTA-3'
(SEQ ID NO:618)

Asp-npro rv-cell
5'-GTAGTACGCGTTGCCGCCGATGACGTCGACCATGGTCCGCGGGGTCTCGTCGGTCT-3'
(SEQ ID NO:619)

Xba-1 fw A4
5'-GCAGCCTGAACTAGTTGCGATCCTCTAGAGATCGAACTTCATGTTCGA-3' (SEQ ID
NO:620)

The polynucleotide primers for the cloning of Cellulosimicrobium cellulans protease
(Cmcp) in pSEA4CT were based on SEQ ID NO:71, and are as follows:

ASP-npro fw cellu
5'-ACCGAGGAGACCCCGCGGACCATGCACGGCGACGTGCGCGGCGGCGACCGCTA-3'
(SEQ ID NO:621)

ASP-npro rv cellu
5'-TAGCGGTCGCCGCCGCGCACGTCGCCGTGCATGGTCCGCGGGGTCTCGTCGGT-3'
(SEQ ID NO:622)

Cellu-BH1-rv
5'-TCAGCCGATCCGCTCGCGGATCCCCATTGTCAGCGAGCCCGACGAGCGCGCTGCCCGAC-3'
(SEQ ID NO:623)

Xba-1 fw A4
5'-GCAGCCTGAACTAGTTGCGATCCTCTAGAGATCGAACTTCATGTTCGA-3' (SEQ ID
NO:620)

The host Streptomyces lividans TK23 was transformed with plasmid vector
pSEA4CT using the protoplast method described above (i.e., Hopwood et al., supra). The
transformed culture was expanded to provide two fermentation cultures in TS* medium. The
composition of TS* medium was (g/L): tryptone (Difco) 16, soytone (Difco) 4, casein
hydrolysate (Merck) 20, K2HPO4 10, glucose 15, Basildon antifoam 0.6, pH 7.0. At various
time points, samples of the fermentation broths were removed for analysis. For the
purposes of this experiment, a skim milk procedure was used to confirm successful cloning.
30 µL of the shake flask supernatant was pipetted in punched out holes in skim milk agar
plates and incubated at 37°C.

The incubated plates were visually reviewed after overnight incubation for the 
presence of clearing zones (halos) indicating the expression of proteolytic enzyme. For 
purposes of this experiment, the samples were also assayed for protease activity and for 
molecular weight (SDS-PAGE). At the end of the fermentation full length protease was 
observed by SDS-PAGE. 

A sample of the fermentation broth was assayed as follows: 10 µL of the diluted
supernatant was taken and added to 190 µL AAPF substrate solution (conc. 1 mg/ml, in 0.1
M Tris/0.005% Tween 80, pH 8.6). The rate of increase in absorbance at 410 nm due to
release of p-nitroaniline was monitored (25°C).

As in previous Examples, the results obtained clearly indicated that the
polynucleotides from Cellulomonas cellasea and from Cellulosimicrobium cellulans, both
encoding polypeptides having proteolytic activity, were expressed in Streptomyces lividans.

EXAMPLE 19 

Determination of the Crystal Structure of ASP Protease 

In this Example, methods used to determine the crystal structure of ASP protease
are described. Indeed, high quality single crystals were obtained from purified ASP
protease. The crystallization conditions were as follows: 25% PEG 8000, 0.2M ammonium
sulphate, and 15% glycerol. These crystallization conditions are cryo-protective, so transfer
to a cryoprotectant was not required. The crystals were frozen in liquid nitrogen, and kept
frozen during data collection using an Xstream (Molecular Structure). Data were collected
with an R-axis IV (Molecular Structure), equipped with focusing mirrors. X-ray reflection data
were obtained to 1.9 Å resolution. The space group was P2₁2₁2₁, with cell dimensions
a=35.65 Å, b=51.82 Å and c=76.86 Å. There was one molecule per asymmetric unit.
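The cell dimensions and space group given above are enough for a rough packing estimate. As a hedged sketch, the Python fragment below computes the orthorhombic unit cell volume (a x b x c) and, under stated assumptions, a Matthews coefficient and approximate solvent fraction. The value of four asymmetric units per cell is the standard count for P2₁2₁2₁; the molecular weight of 19,000 Da is a placeholder assumption that does not come from the Specification, and the 1.23 constant is the conventional approximation used with the Matthews coefficient.

```python
def orthorhombic_cell_volume(a, b, c):
    """Unit cell volume in cubic angstroms for an orthorhombic cell (all angles 90 degrees)."""
    return a * b * c


def matthews_coefficient(cell_volume, molecular_weight_da, asu_per_cell, molecules_per_asu=1):
    """Matthews coefficient V_M in cubic angstroms per dalton."""
    return cell_volume / (asu_per_cell * molecules_per_asu * molecular_weight_da)


def solvent_fraction(v_m):
    """Approximate solvent fraction from V_M using the conventional 1.23 constant."""
    return 1.0 - 1.23 / v_m


if __name__ == "__main__":
    volume = orthorhombic_cell_volume(35.65, 51.82, 76.86)
    # ASSUMPTION: 19,000 Da is an illustrative molecular weight, not a value from the text.
    v_m = matthews_coefficient(volume, molecular_weight_da=19000.0, asu_per_cell=4)
    print(f"cell volume: {volume:.1f} cubic angstroms")
    print(f"V_M: {v_m:.2f} cubic angstroms per Da, solvent fraction: {solvent_fraction(v_m):.2f}")
```

Substituting the actual molecular weight of the crystallized protein would be needed before drawing any conclusion from the resulting numbers.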

The crystal structure was solved using the molecular replacement method. The
program used was X-MR (Accelrys Inc.). The starting model for the molecular replacement
calculations was Streptogrisin. It is clear from the electron density map obtained from X-MR
that the molecular replacement solution is correct. Thus, 98% of the model was built
correctly, with some minor errors that were fixed manually. The R-factor for data to 1.9 Å
was 0.23.

The structure was found to largely consist of β-sheets, with 2 very short α-helices,
and a longer helix toward the C-terminal end. There are two sets of β-sheets, with a
considerable interface between them. The active-site is found in a cleft formed at this
interface. The catalytic triad is formed by His 32, Asp 56, and Ser 137. Table 19-1 provides
the atomic coordinates identified for ASP.

Table 19-1


Atomic Coordinates for ASP 












CRYST1   35.770   51.730   76.650  90.00  90.00  90.00 P212121
ATOM      1  N   PHE A   1       2.421  18.349  15.176  1.00 16.78           N
ATOM      2  CA  PHE A   1       3.695  18.087  15.905  1.00 18.18           C
ATOM      3  CB  PHE A   1       4.875  18.550  15.048  1.00 16.73           C
ATOM      4  C   PHE A   1       3.700  18.810  17.249  1.00 16.36           C
ATOM      5  O   PHE A   1       3.443  20.011  17.315  1.00 17.91           O
ATOM      6  CG  PHE A   1       6.214  18.292  15.664  1.00 17.42           C
ATOM      7  CD2 PHE A   1       6.955  17.180  15.296  1.00 19.42           C
ATOM      8  CD1 PHE A   1       6.736  19.160  16.611  1.00 16.13           C
ATOM      9  CE2 PHE A   1       8.200  16.933  15.863  1.00 18.08           C
ATOM     10  CE1 PHE A   1       7.977  18.922  17.180  1.00 18.34           C
ATOM     11  CZ  PHE A   1       8.710  17.807  16.806  1.00 19.32           C
ATOM     12  N   ASP A   2       3.984  18.076  18.321  1.00 13.94           N
ATOM     13  CA  ASP A   2       4.015  18.670  19.654  1.00 15.04           C
ATOM     14  CB  ASP A   2       3.527  17.677  20.714  1.00 15.13           C
ATOM     15  C   ASP A   2       5.403  19.149  20.063  1.00 14.43           C
ATOM     16  O   ASP A   2       6.381  18.408  19.966  1.00 11.44           O
ATOM     17  CG  ASP A   2       2.088  17.243  20.502  1.00 18.25           C
ATOM     18  OD2 ASP A   2       1.721  16.150  20.986  1.00 19.05           O
ATOM     19  OD1 ASP A   2       1.320  17.996  19.874  1.00 15.33           O
ATOM     20  N   VAL A   3       5.479  20.393  20.523  1.00 12.30           N
ATOM     21  CA  VAL A   3       6.740  20.979  20.959  1.00 11.83           C
ATOM     22  CB  VAL A   3       6.812  22.480  20.603  1.00 11.52           C
ATOM     23  C   VAL A   3       6.766  20.795  22.470  1.00 13.77           C
ATOM     24  O   VAL A   3       5.912  21.321  23.183  1.00 11.14           O
ATOM     25  CG1 VAL A   3       7.987  23.133  21.309  1.00 15.13           C
ATOM     26  CG2 VAL A   3       6.968  22.637  19.101  1.00 14.21           C
ATOM     27  CB  ILE A   4       7.561  18.267  24.642  1.00 14.73           C
ATOM     28  CG2 ILE A   4       7.799  17.929  26.099  1.00 14.20           C
ATOM     29  CG1 ILE A   4       6.103  17.995  24.267  1.00 16.79           C
ATOM     30  CD1 ILE A   4       5.774  16.518  24.166  1.00 19.32           C
ATOM     31  C   ILE A   4       9.334  20.031  24.816  1.00 14.04           C
ATOM     32  O   ILE A   4      10.289  19.660  24.140  1.00 11.09           O
ATOM     33  N   ILE A   4       7.745  20.033  22.945  1.00 10.83           N
ATOM     34  CA  ILE A   4       7.903  19.750  24.365  1.00 13.46           C
ATOM     35  N   GLY A   5       9.475  20.681  25.965  1.00 11.82           N
ATOM     36  CA  GLY A   5      10.800  20.995  26.467  1.00  9.81           C
ATOM     37  C   GLY A   5      11.700  19.785  26.644  1.00 11.77           C
ATOM     38  O   GLY A   5      11.256  18.737  27.114  1.00  9.20           O
ATOM     39  N   GLY A   6      12.966  19.927  26.255  1.00 10.03           N
ATOM     40  CA  GLY A   6      13.917  18.836  26.397  1.00  8.54           C
ATOM     41  C   GLY A   6      14.070  17.979  25.156  1.00  9.57           C
ATOM     42  O   GLY A   6      15.020  17.200  25.042  1.00  7.69           O
ATOM     43  N   ASN A   7      13.131  18.119  24.224  1.00  9.01           N
ATOM     44  CA  ASN A   7      13.168  17.359  22.985  1.00 10.51           C
ATOM     45  CB  ASN A   7      11.780  17.293  22.349  1.00 14.65           C
ATOM     46  CG  ASN A   7      10.897  16.250  22.981  1.00 10.35           C
ATOM     47  OD1 ASN A   7       9.715  16.144  22.644  1.00 13.61           O
ATOM     48  ND2 ASN A   7      11.456  15.470  23.896  1.00  6.66           N
ATOM     49  C   ASN A   7      14.130  17.952  21.976  1.00 12.30           C
ATOM     50  O   ASN A   7      14.424  19.146  21.991  1.00 15.93           O
ATOM     51  N   ALA A   8      14.608  17.107  21.079  1.00 11.08           N
ATOM     52  CA  ALA A   8      15.532  17.564  20.063  1.00 14.32           C
ATOM     53  CB  ALA A   8      16.336  16.392  19.541  1.00 14.61           C
ATOM     54  C   ALA A   8      14.766  18.202  18.914  1.00 11.23           C
ATOM     55  O   ALA A   8      13.567  17.987  18.747  1.00 12.54           O
ATOM     56  N   TYR A   9      15.468  19.021  18.145  1.00  9.75           N
ATOM     57  CA  TYR A   9      14.899  19.691  16.988  1.00 12.42           C
ATOM     58  CB  TYR A   9      14.279  21.059  17.334  1.00 12.79           C
ATOM     59  CG  TYR A   9      15.216  22.150  17.790  1.00 14.12           C
ATOM     60  CD2 TYR A   9      15.485  22.333  19.139  1.00 10.17           C
ATOM     61  CE2 TYR A   9      16.302  23.366  19.572  1.00 12.49           C
ATOM     62  CD1 TYR A   9      15.791  23.029  16.877  1.00  9.02           C
ATOM     63  CE1 TYR A   9      16.604  24.066  17.294  1.00 10.92           C
ATOM     64  CZ  TYR A   9      16.857  24.230  18.644  1.00 13.93           C
ATOM     65  OH  TYR A   9      17.661  25.261  19.070  1.00 12.50           O
ATOM     66  C   TYR A   9      16.127  19.792  16.101  1.00 12.21           C
ATOM     67  O   TYR A   9      17.247  19.589  16.583  1.00 11.38           O
ATOM     68  N   THR A  10      15.946  20.055  14.816  1.00 11.44           N
ATOM     69  CA  THR A  10      17.105  20.144  13.946  1.00 13.35           C
ATOM     70  CB  THR A  10      17.114  18.998  12.916  1.00 14.07           C
ATOM     71  OG1 THR A  10      15.952  19.098  12.086  1.00 13.63           O
ATOM     72  CG2 THR A  10      17.121  17.648  13.620  1.00 12.60           C
ATOM     73  C   THR A  10      17.267  21.452  13.194  1.00 14.66           C
ATOM     74  O   THR A  10      16.299  22.161  12.907  1.00 12.64           O
ATOM     75  N   ILE A  11      18.520  21.749  12.881  1.00 14.05           N
ATOM     76  CA  ILE A  11      18.889  22.954  12.157  1.00 18.00           C
ATOM     77  CB  ILE A  11      19.649  23.931  13.068  1.00 17.58           C
ATOM     78  CG2 ILE A  11      19.919  25.230  12.323  1.00 20.00           C
ATOM     79  CG1 ILE A  11      18.825  24.212  14.327  1.00 21.47           C
ATOM     80  CD1 ILE A  11      19.560  25.031  15.377  1.00 23.61           C
ATOM     81  C   ILE A  11      19.802  22.485  11.030  1.00 16.40           C
ATOM     82  O   ILE A  11      20.913  22.014  11.278  1.00 17.72           O
ATOM     83  N   GLY A  12      19.330  22.603   9.794  1.00 18.83           N
ATOM     84  CA  GLY A  12      20.132  22.155   8.673  1.00 17.69           C
ATOM     85  C   GLY A  12      20.359  20.659   8.791  1.00 18.86           C
ATOM     86  O   GLY A  12      21.395  20.141   8.376  1.00 19.71           O
ATOM     87  N   GLY A  13      19.391  19.964   9.380  1.00 17.62           N
ATOM     88  CA  GLY A  13      19.509  18.525   9.529  1.00 16.37           C
ATOM     89  C   GLY A  13      20.352  18.060  10.703  1.00 17.10           C
ATOM     90  O   GLY A  13      20.470  16.861  10.946  1.00 15.94           O
ATOM     91  N   ARG A  14      20.931  19.002  11.438  1.00 17.27           N
ATOM     92  CA  ARG A  14      21.772  18.667  12.585  1.00 15.15           C
ATOM     93  CB  ARG A  14      23.017  19.558  12.586  1.00 19.68           C
ATOM     94  C   ARG A  14      21.030  18.842  13.908  1.00 16.27           C
ATOM     95  O   ARG A  14      20.423  19.882  14.159  1.00 12.16           O
ATOM     96  CG  ARG A  14      24.009  19.273  13.699  1.00 25.94           C
ATOM     97  CD  ARG A  14      24.879  18.069  13.393  1.00 31.69           C
ATOM     98  NE  ARG A  14      25.964  17.928  14.360  1.00 40.26           N
ATOM     99  CZ  ARG A  14      25.802  17.572  15.630  1.00 42.65           C
ATOM    100  NH1 ARG A  14      26.852  17.483  16.435  1.00 45.09           N
ATOM    101  NH2 ARG A  14      24.592  17.302  16.091  1.00 41.89           N
ATOM    102  N   SER A  15      21.075  17.821  14.756  1.00 14.36           N
ATOM    103  CA  SER A  15      20.407  17.892  16.047  1.00 18.05           C
ATOM    104  CB  SER A  15      20.033  16.488  16.524  1.00 19.52           C
ATOM    105  C   SER A  15      21.402  18.533  17.011  1.00 18.51           C
ATOM    106  O   SER A  15      21.966  17.870  17.882  1.00 16.89           O
ATOM    107  OG  SER A  15      19.311  16.542  17.742  1.00 24.25           O
ATOM    108  N   ARG A  16      21.625  19.829  16.842  1.00 15.76           N
ATOM    109  CA  ARG A  16      22.560  20.544  17.695  1.00 18.30           C
ATOM    110  CB  ARG A  16      23.077  21.795  16.976  1.00 22.82           C
ATOM    111  C   ARG A  16      22.006  20.952  19.050  1.00 17.05           C
ATOM    112  O   ARG A  16      22.760  21.064  20.015  1.00 11.60           O
ATOM    113  CG  ARG A  16      23.892  21.498  15.729  1.00 30.78           C

22.758 15.131 1.00 36.12 C 
23.756 14.789 1.00 41.88 N 
24.839 14.058 1.00 44.68 C 
25.057 13.579 1.00 46.43 N 
25.698 13.796 1.00 44.09 N 
21.152 19.130 1.00 12.26 N 
21.562 20.388 1.00 11.02 C 
23.079 20.394 1.00 11.05 C 
20.946 20.756 1.00 8.62 C 
20.154 20.008 1.00 10.24 O 
23.945 20.503 1.00 10.83 S 

21.338 21.926 1.00 9.44 N 
20.849 22.441 1.00 10.14 C 
20.053 23.726 1.00 11.06 C 
19.042 23.516 1.00 11.13 O 
22.004 22.736 1.00 10.28 C 
23.152 22.882 1.00 12.80 O 
21.689 22.806 1.00 8.87 N 
22.676 23.087 1.00 9.04 C 
22.070 22.951 1.00 9.94 C
23.126 23.287 1.00 10.60 C 
21.514 21.543 1.00 12.49 C 
22.554 20.439 1.00 10.46 C 
23.154 24.530 1.00 9.36 C 
22.346 25.442 1.00 7.81 O 
24.466 24.729 1.00 6.59 N 
25.024 26.067 1.00 7.48 C 
25.027 26.649 1.00 10.12 C 
24.128 27.400 1.00 9.28 O 
26.037 26.293 1.00 11.70 N 
26.132 26.770 1.00 9.99 C 
27.019 28.009 1.00 12.23 C 
26.455 29.197 1.00 12.14 C 
25.493 29.985 1.00 10.45 C 
26.873 29.517 1.00 11.10 C 

24.953 31.078 1.00 9.63 C 

26.339 30.606 1.00 10.44 C 
25.377 31.390 1.00 5.44 C 
26.721 25.692 1.00 11.93 C 
27.500 24.861 1.00 11.86 O 
26.346 25.709 1.00 8.59 N 
26.861 24.722 1.00 10.98 C 
25.920 24.580 1.00 9.33 C 

28.233 25.200 1.00 9.72 C 
28.431 26.390 1.00 10.20 O 
29.178 24.270 1.00 9.39 N 
30.542 24.579 1.00 11.79 C 

31.545 24.567 1.00 8.77 C 
31.176 25.644 1.00 12.30 C
31.557 23.195 1.00 9.56 C 
30.943 23.496 1.00 12.96 C 

30.234 22.507 1.00 15.51 O 
32.066 23.668 1.00 15.64 N
32.472 22.642 1.00 18.48 C 
33.772 23.048 1.00 23.96 C 
32.661 21.319 1.00 18.42 C 
33.410 21.251 1.00 16.60 O 
34.949 23.182 1.00 23.94 C 
34.951 24.025 1.00 23.82 O 
35.964 22.345 1.00 25.51 N 
31.956 20.278 1.00 19.39 N 
32.086 18.978 1.00 18.25 C 
31.106 18.649 1.00 18.73 C 

31.065 17.515 1.00 18.70 O 
30.318 19.624 1.00 14.44 N
29.362 19.348 1.00 15.00 C 
28.822 20.602 1.00 11.05 C 
28.457 21.554 1.00 10.66 O 

28.759 20.599 1.00 11.66 N 
28.2*8 21.761 1.00 11.72 C 
26.748 21.679 1.00 10.14 C 
28.960 21.934 1.00 10.62 C 
29.509 20.985 1.00 12.74 O 
27.329 9.811 1.00 16.44 C

26.956 8.397 1.00 16.50 c 

26.146 7.746 1.00 22.08 O 

28.209 7.578 1.00 17.88 C 

26.052 10.622 1.00 14.04 c 

25.467 10.669 1.00 13.48 O 

25.626 11.256 1.00 14.41 n 

24.421 12.072 1.00 12.76 C 

24.735 13.536 1.00 13.70 C 

25.260 13.601 1.00 11.68 O 

25.752 14.118 1.00 10.97 c 

23.399 11.566 1.00 12.70 C 

23.717 10.773 1.00 15.30 O 

22.164 12.033 1.00 12.69 N 

21.062 11.667 1.00 13.39 C 

20.346 12.986 1.00 13.08 c 

20.403 13.912 1.00 13.32 O 

20.121 10.682 1.00 12.91 C 

19.680 13.075 1.00 13.98 N 

18.966 14.296 1.00 15.22 C 

19.886 15.507 1.00 15.41 c 

19.580 16.426 1.00 14.69 O 

17.727 14.507 1.00 18.61 C 

16.826 13.282 1.00 22.16 C 

16.685 12.536 1.00 20.39 O 

16.192 13.085 1.00 21.80 N 

20.994 15.558 1.00 12.16 N 

21.449 14.579 1.00 13.99 C

22.178 15.454 1.00 14.60 C 

22.387 13.508 1.00 14.85 C 

22.950 13.633 1.00 12.84 O 

21.862 16.751 1.00 li.35 C 

22.940 16.356 1.00 12.54 C 
22.556 12.454 1.00 12.78 N 
23.436 11.370 1.00 13.48 C 
23.349 10.217 1.00 15.07 C 
24.818 12.010 1.00 13.36 C 
25.127 12.721 1.00 12.32 O 
22.082 9.565 1.00 17.67 O 
24.473 9.216 1.00 14.97 c 
25.631 11.787 1.00 12.10 N 
26.958 12.369 1.00 13.77 C 
27.824 11.865 1.00 12.84 C 
27.412 11.006 1.00 14.31 O 
29.033 12.404 1.00 12.18 N 
29.970 12.001 1.00 15.03 C 
30.953 10.952 1.00 15.90 C 
30.219 9.884 1.00 20.72 O 
31.821 10.392 1.00 18.41 C 
30.777 13.203 1.00 13.19 C 
31.331 13.944 1.00 10.72 O 
30.835 13.407 1.00 11.27 N 
31.596 14.527 1.00 10.95 C 
31.508 14.596 1.00 11.26 C 
30.306 15.346 1.00 12.89 C 
30.442 16.633 1.00 8.64 C
29.046 14.767 1.00 12.80 C 
29.342 17.331 1.00 12.73 C 

27.941 15.457 1.00 12.73 C 
28.088 16.740 1.00 14.16 C 
33.041 14.291 1.00 12.22 C 
33.563 13.182 1.00 13.19 O 
33.673 15.330 1.00 11.62 N 
35.059 15.240 1.00 12.91 C 
35.126 15.261 1.00 13.93 C
35.856 16.400 1.00 15.66 C 
36.598 17.072 1.00 21.12 O 
35.700 16.622 1.00 15.68 N 
36.407 17.701 1.00 16.25 C 
35.500 18.352 1.00 15.88 C
34.403 18.799 1.00 13.45 O 
35.947 18.405 1.00 13.85 N 
35.144 19.012 1.00 13.96 C 
34.156 17.984 1.00 17.08 C 
33.365 18.541 1.00 14.72 O 
36.026 19.543 1.00 16.90 C 
36.894 18.835 1.00 16.85 0 
35.802 20.801 1.00 15.23 N 
36.561 21.447 1.00 14.67 C

37.431 22.570 1.00 15.96 c 

38.172 23.187 1.00 18.34 0 

35.622 22.017 1.00 11.74 c 

34.888 22.969 1.00 12.45 o 

35.655 21.419 1.00 8.44 N 

34.822 21.842 1.00 11.84 c 

33.344 21.557 1.00 9.85 c 

32.450 21.837 1.00 14.45 c 

32.056 23.133 1.00 14.77 c 

32.037 20.808 1.00 14.93 c 

31.267 23.400 1.00 12.39 C 

31.250 21.067 1.00 13.03 c 

30.864 22.364 1.00 15.39 C 

35.245 21.051 1.00 11.09 C 

35.416 19.830 1.00 10.06 0 

35.427 21.732 1.00 13.84 N 

35.610 21.055 1.00 13.82 C 

35.257 23.177 1.00 11.97 C 

35.114 23.319 1.00 15.91 C 

35.957 22.201 1.00 16.14 C 

36.429 23.972 1.00 13.65 C 

37.085 23.516 1.00 12.98 0 

36.706 25.144 1.00 13.22 N 

37.778 25.975 1.00 13.41 C 

37.112 26.781 1.00 13.11 c 

36.931 27.995 1.00 12.76 0 

36.740 26.083 1.00 13.05 N 

36.013 26.681 1.00 14.39 C 

36.681 26.396 1.00 12.65 C 

38.153 26.682 1.00 11.23 C 

38.967 25.784 1.00 16.09 0 

38.516 27.933 1.00 11.47 N 

34.721 25.888 1.00 15.51 C 

34.485 25.123 1.00 11.36 0 

33.890 26.072 1.00 14.13 N 

32.631 25.346 1.00 11.90 C 

31.522 25.993 1.00 9.70 c 

30.320 25.070 1.00 9.97 C 

29.330 25.459 1.00 12.57 0 

30.365 23.938 1.00 8.45 0 

32.216 25.279 1.00 9.86 C 

31.254 25.920 1.00 11.82 O 

32.969 24.509 1.00 8.71 N 

32.677 24.351 1.00 10.51 C 

33.480 25.348 1.00 12.30 C 

34.992 25.271 1.00 12.51 C 

35.708 26.291 1.00 11.12 C 
37.094 26.244 1.00 11.36 C 

35.706 24.197 1.00 13.29 C 
37.096 24.144 1.00 10.62 C 
37.783 25.169 1.00 13.60 C 
39.162 25.122 1.00 12.04 0 
32.963 22.933 1.00 10.26 C 
33.674 22.172 1.00 10.59 0 
32.393 22.578 1.00 9.32 N 
32.583 21.254 1.00 7.41 C 
31.732 20.241 1.00 7.89 C 
32.207 21.271 1.00 10.96 C 
31.510 22.175 1.00 11.10 O 
32.702 20.277 1.00 11.71 N 
32.435 20.136 1.00 12.26 c 

33.707 20.333 1.00 10.18 C 
33.576 19.859 1.00 11.71 C 
34.297 18.764 1.00 11.51 C 

32.709 20.490 1.00 10.35 C 
34.156 18.307 1.00 15.38 C 
32.563 20.044 1.00 14.84 C 
33.286 18.949 1.00 13.16 C 
31.931 18.722 1.00 11.77 .C 
32.507 17.771 1.00 13.80 0 
30.852 18.590 1.00 10.53 N 
30.278 17.285 1.00 11.14 C 
28.796 17.209 1.00 15.19 C 
28.212 15.856 1.00 10.78 C 
28.670 17.421 1.00 11.44 C 
30.363 17.082 1.00 11.30 C 
29.905 17.924 1.00 8.90 O 
30.964 15.979 1.00 12.67 N

31.083 15.716 1.00 11.18 C 

32.314 14.844 1.00 12.63 C 

32.398 14.379 1.00 17.12 C 

33.527 13.376 1.00 20.85 C 

33.614 12.971 1.00 24.18 N 

32.744 12.171 1.00 24.05 C 

32.904 11.884 1.00 25.34 N 

31.708 11.670 1.00 25.91 N 

29.831 15.011 1.00 12.67 C 

29.316 14.096 1.00 11.46 0 

29.333 15.461 1.00 13.58 N 

28.147 14.865 1.00 13.24 C 

26.995 15.884 1.00 11.66 C 

27.450 17.007 1.00 13.55 0 

26.485 16.349 1.00 13.26 C 

28.558 14.335 1.00 13.42 C 

29.568 14.770 1.00 16.80 0 

27.778 13.406 1.00 16.51 N 

28.108 12.834 1.00 15.85 C 

27.033 12.894 1.00 16.64 c 

26.432 13.938 1.00 12.21 0 

26.788 11.753 1.00 15.51 N 

25.810 11.663 1.00 15.84 C

24.378 11.977 1.00 15.00 C 

23.977 11.742 1.00 15.60 O 

25.866 10.279 1.00 16.27 C 

23.614 12.510 1.00 17.17 N 

22.217 12.828 1.00 19.41 C 

21.946 13.953 1.00 19.21 C 

20.790 14.234 1.00 22.10 O 

23.001 14.603 1.00 15.20 N

22.844 15.697 1.00 15.99 C 

23.746 15.501 1.00 15.02 C 

23.195 17.016 1.00 18.46 C 

24.349 17.257 1.00 16.96 O 

23.602 16.688 1.00 13.36 C 

23.375 14.195 1.00 11.46 C 

22.193 17.866 1.00 15.34 N 

22.407 19.158 1.00 16.12 C 

21.177 19.539 1.00 21.01 C 

22.704 20.228 1.00 17.24 C 

21.862 20.554 1.00 17.97 O 

20.748 18.431 1.00 29.21 C 

21.527 17.976 1.00 33.32 0 

19.505 17.982 1.00 33.03 N 

23.915 20.767 1.00 13.94 N 

24.378 21.807 1.00 14.43 C 

25.896 21.707 1.00 13.70 C 

23.985 23.178 1.00 15.01 C 

24.568 23.664 1.00 16.08 0 

26.395 20.358 1.00 8.95 C 

27.910 20.284 1.00 8.47 C 

25.931 20.179 1.00 12.27 C 

23.005 23.805 1.00 12.99 N 

22.529 25.119 1.00 12.18 C 

20.997 25.134 1.00 12.27 C 

20.310 24.029 1.00 16.54 C 

18.802 24.113 1.00 17.85 C 

20.679 24.170 1.00 19.65 C 

23.050 26.307 1.00 14.39 C 

23.239 26.228 1.00 14.53 0 

23.271 27.416 1.00 12.89 N 

23.761 28.635 1.00 14.83 C 

24.547 29.457 1.00 18.71 C 

22.519 29.391 1.00 12.67 C 

22.293 30.523 1.00 11.15 O 

21.711 28.742 1.00 13.59 N 

20.483 29.334 1.00 14.04 C 

19.282 28.809 1.00 14.08 C 

19.283 29.157 1.00 17.65 C 
18.099 28.560 1.00 19.50 C 
17.592 29.143 1.00 24.87 0 
17.658 27.386 1.00 17.48 N 
20.255 29.011 1.00 16.23 C 
20.786 28.035 1.00 15.48 O 
19.451 29.840 1.00 13.56 N- 
19.133 29.648 1.00 12.57 C 
9.792 19.754 30.748 1.00

11.193 19.162 30.677 1.00 

9.862 21.271 30.563 1.00 

9.007 17.610 29.695 1.00 

8.415 16.968 30.565 1.00 

9.736 17.036 28.746 1.00 

9.913 15.586 28.673 1.00 

10.633 15.229 27.369 1.00 

10.598 13.743 27.065 1.00 

10.411 12.916 27.959 1.00 

10.790 13.397 25.798 1.00 

10.751 15.098 29.863 1.00 

11.854 15.597 30.092 1.00 

10.239 14.137 30.631 1.00 

11.010 13.640 31.766 1.00 

10.109 13.275 32.958 1.00 

9.162 12.126 32.662 1.00 

9.432 11.274 31.815 1.00 

8.048 12.088 33.384 1.00 

11.853 12.435 31.359 1.00 

12.528 11.823 32.189 1.00 

11.813 12.115 30.069 1.00 

12.556 10.998 29.495 1.00 

14.039 11.363 29.386 1.00 

14.313 12.223 28.170 1.00 

14.424 11.652 26.907 1.00 

14.591 12.435 25.775 1.00 

14.381 13.608 28.271 1.00 

14.545 14.402 27.142 1.00 

14.648 13.805 25.898 1.00 

14.793 14.579 24.770 1.00 

12.380 9.652 30.188 1.00 

13.298 8.835 30.228 1.00 

11.185 9.433 30.723 1.00 

10.846 8.193 31.411 1.00 

10.811 8.390 32.926 1.00 

12.121 8.424 33.457 1.00 

9.470 7.775 30.919 1.00 

8.843 6.868 31.473 1.00 

9.013 8.452 29.870 1.00 

7.715 8.156 29.295 1.00 

6.649 9.128 29.752 1.00 

5.464 8.942 29.470 1.00 

7.059 10.173 30.462 1.00 

6.088 11.142 30.939 1.00 

6.499 12.585 30.734 1.00 

7.481 12.876 30.041 1.00 

5.742 13.492 31.342 1.00 

6.025 14.914 31.226 1.00 

5.199 15.528 30.090 1.00 

5.711 15.176 28.701 1.00

4.683 14.404 27.910 1.00 

5.207 13.941 26.626 1.00 

6.223 13.094 26.493 1.00 

6.838 12.611 27.566 1.00 

6.620 12.716 25.285 1.00 

5.784 -15.695 32.510 1.00 

4.968 15.313 33.353 1.00 

6.517 16.793 32.646 1.00 

6.412 17.660 33.810 1.00 

7.806 18.040 34.349 1.00 

7.666 18.967 35.542 1.00 

8.580 16.787 34.729 1.00 

5.690 18.930 33.375 1.00 

6.106 19.588 32.421 1.00 

4.602 19.270 34.057 1.00 

3.863 20.472 33.698 1.00 

2.503 20.512 34.403 1.00 

1.422 19.659 33.760 1.00 

1.161 20.030 32.311 1.00 

0.928 21.194 31.984 1.00 

1.192 19.034 31.434 1.00 

4.654 21.722 34.067 1.00 

5.278 21.786 35.128 1.00 

4.636 22.709 33.179 1.00 

5.345 23.960 33.411 1.00 

5.973 24.494 32.107 1.00 

6.710 25.792 32.374 1.00 
23.454 31.534 1.00 15.85 C

24.952 33.930 1.00 18.78 C 

25.494 33.163 1.00 19.15 0 

25.175 35.240 1.00 20.30 N 

26.091 35.879 1.00 20.84 C 

25.725 37.348 1.00 20.26 C 

27.568 35.751 1.00 20.34 C 

28.405 35.594 1.00 21.44 O 

27.886 35.826 1.00 18.33 N 

29.267 35.721 1.00 15.96 C 

29.381 35.558 1.00 18.35 C 

28.433 35.117 1.00 16.24 O 

30.534 35.924 1.00 16.53 N 

30.767 35.798 1.00 14.08 C 

31.528 34.498 1.00 14.33 C 

31.510 36.988 1.00 14.07 c 

32.413 36.818 1.00 15.60 O 

32.770 34.323 1.00 18.31 C 

34.035 34.404 1.00 21.61 N 

32.936 34.064 1.00 19.95 C 

34.289 33.994 1.00 18.84 N 

34.929 34.202 1.00 22.08 C 

31.124 38.193 1.00 14.33 N 

31.758 39.405 1.00 13.94 C 

31.449 40.612 1.00 15.26 C 

31.243 39.690 1.00 14.65 C 

30.042 39.855 1.00 11.10 O 

31.904 40.347 1.00 16.89 O 

32.147 41.854 1.00 16.68 C 

32.157 39.756 1.00 15.86 N 

31.801 40.016 1.00 17.16 C 

31.152 41.375 1.00 19.39 C 

31.608 42.395 1.00 18.84 O 

33.034 39.877 1.00 17.44 C 

30.088 41.373 1.00 16.82 N 

29.352 42.584 1.00 14.95 C 

29.832 43.119 1.00 15.66 C 

30.204 42.355 1.00 15.62 O 

27.861 42.291 1.00 10.05 C 

29.822 44.447 1.00 15.05 N 

30.259 45.100 1.00 16.15 C 
30.498 46.535 1.00 16.59 C 

29.260 45.022 1.00 17.33 C 
28.076 44.741 1.00 14.79 O 
29.522 45.448 1.00 17.98 C 
29.425 46.728 1.00 15.94 C 
29.751 45.257 1.00 18.24 N 
28.894 45.221 1.00 17.32 C 
29.658 45.672 1.00 16.39 C 
28.678 45.932 1.00 19.70 C 
30.665 44.609 1.00 18.18 C 
27.770 46.211 1.00 17.15 C 
28.005 47.254 1.00 17.16 O 
26.556 45.878 1.00 13.56 N 

25.420 46.755 1.00 13.61 C 
24.583 46.314 1.00 14.54 C 
23.422 46.695 1.00 13.48 O 
25.175 45.497 1.00 12.12 N 
24.486 45.014 1.00 13.41 C 
25.457 44.239 1.00 10.87 C 
26.463 45.090 1.00 12.36 O 
23.284 44.134 1.00 13.34 C 
23.268 43.383 1.00 9.90 O 
22.274 44.252 1.00 11.16 N 
21.057 43.475 1.00 14.34 C 
19.925 .44.136 1.00 14.73 C 
21.389 42.119 1.00 14.46 C 
21.920 42.047 1.00 13.83 O 
21.092 41.048 1.00 14.27 N 
21.370 39.707 1.00 9.84 .C 
22.629 39.113 1.00 11.32 C 
23.859 39.904 1.00 9.34 C 
22.467 39.126 1.00 10.97 C 
20.209 38.763 1.00 9.69 C 

19.421 38.976 1.00 10.59 O 
20.094 37.727 1.00 10.10 N 
19.027 36.752 1.00 9.94 C 
17.983 36.845 1.00 11.63 C 
38.608 27.026 1.00 17.74 O

38.733 28.608 1.00 16.26 N 

40.190 28.592 1.00 14.31 C 

40.681 29.769 1.00 19.36 C 

42.198 29.866 1.00 25.15 C 

42.891 30.315 1.00 30.84 C 

42.684 29.743 1.00 30.46 O 

43.723 31.348 1.00 34.53 N 

40.694 27.282 1.00 15.28 C 

41.694 26.722 1.00 10.15 O 

39.994 26.786 1.00 13.61 N 

40.393 25.544 1.00 16.01 C 

39.453 25.234 1.00 14.76 C 

40.427 24.363 1.00 17.13 C 

41.387 23.598 1.00 14.51 O 

39.391 24.229 1.00 16.19 N 
39.329 23.124 1.00 17.75 C 
37.947 22.476 1.00 15.30 C 
37.599 22.138 1.00 15.11 C 

38.482 21.415 1.00 16.28 C 
38.186 21.101 1.00 13.92 C 
36.395 22.548 1.00 12.95 C 
36.086 22.238 1.00 12.38 C 
36.990 21.512 1.00 13.61 C 
36.705 21.184 1.00 13.98 O 
39.653 23.461 1.00 14.14 C 
39.546 22.604 1.00 16.16 O 
40.057 24.701 1.00 14.82 N 

40.392 25.105 1.00 16.43 C 
39.207 24.933 1.00 16.44 C 
39.340 24.439 1.00 17.81 O 
38.046 25.361 1.00 14.56 N 
36.789 25.258 1.00 15.86 C 
35.686 24.778 1.00 15.44 C 

35.807 23.415 1.00 19.21 C 
34.805 23.331 1.00 18.33 C 
35.553 22.311 1.00 21.44 C 
36.350 26.586 1.00 16.50 C 

36.808 27.650 1.00 16.26 O 
35.447 26.504 1.00 17.69 N 
34.911 27.684 1.00 15.79 C 
35.312 27.729 1.00 21.75 C 
36.700 28.298 1.00 30.60 C 
36.960 28.493 1.00 37.51 C 
38.213 29.199 1.00 47.17 N 
39.405 28.790 1.00 49.75 C 

39.516 27.672 1.00 52.12 N 
40.488 29.500 1.00 50.23 N 
33.398 27.640 1.00 15.02 C 
32.787 26.574 1.00 14.80 O 
32.800 28.799 1.00 13.59 N 
31.356 28.909 1.00 16.64 C 
30.969 30.316 1.00 17.26 C 
31.449 30.714 1.00 22.61 C 
30.324 30.120 1.00 24.38 S 
28.828 30.998 1.00 21.48 C 
30.739 28.706 1.00 16.31 C 

31.255 29.225 1.00 17.83 O 
29.656 27.942 1.00 14.71 N 

29.019 27.740 1.00 13.74 C 
28.044 26.559 1.00 16.62 C 
27.130 26.622 1.00 15.86 C 
28.835 25.247 1.00 17.95 C 
27.982 24.009 1.00 26.87 C 

28.256 29.049 1.00 15.69 C 
27.469 29.438 1.00 13.45 O 

28.483 29.727 1.00 17.74 N 
27.812 30.998 1.00 23.56 C 
26.728 31.019 1.00 26.44 C 
26.088 32.052 1.00 29.65 O
28.841 32.073 1.00 24.55 C 
29.497 31.701 1.00 27.19 O 
29.877 32.211 1.00 28.36 C 

26.517 29.901 1.00 27.09 N 
25.494 29.866 1.00 33.19 C 
25.008 28.447 1.00 33.29 C 
25.765 27.484 1.00 34.57 O 

26.020 30.477 1.00 33.38 C 
             O   HOH W 180      25.397  29.865  14.090  1.00 19.76      S    O
ATOM   1462  O   HOH W 182      11.078  20.731  43.859  1.00 19.77      S    O
ATOM   1463  O   HOH W 184      30.825  30.779  39.402  1.00 19.77      S    O
ATOM   1464  O   HOH W 187      10.289  21.108   7.474  1.00 19.75      S    O
ATOM   1465  O   HOH W 189      27.314  38.906  38.135  1.00 19.76      S    O
ATOM   1466  O   HOH W 197      25.884  26.959  11.320  1.00 19.70      S    O
ATOM   1467  O   HOH W 209       9.364  16.866  38.731  1.00 19.73      S    O
ATOM   1468  O   HOH W 219      32.352  16.134  38.786  1.00 19.73      S    O
ATOM   1469  O   HOH W 221      15.972  35.898  37.609  1.00 19.69      S    O
ATOM   1470  O   HOH W 223       3.319  35.758  13.483  1.00 19.71      S    O
TER    1471      HOH W 223
END
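
As a rough illustration of how the coordinate records in Table 19-1 can be used, the following Python sketch parses two ATOM records and computes the distance between them. It assumes each record is written on a single line with whitespace between fields, which is a simplification of the fixed-column PDB layout. The function names `parse_atom_row` and `distance` are hypothetical helpers introduced for this example and are not part of any program named in the text.

```python
import math


def parse_atom_row(row):
    """Parse one whitespace-delimited ATOM row into a dictionary of typed fields."""
    tokens = row.split()
    if len(tokens) < 12 or tokens[0] != "ATOM":
        raise ValueError("not a complete ATOM row: %r" % row)
    return {
        "serial": int(tokens[1]),
        "name": tokens[2],
        "res_name": tokens[3],
        "chain": tokens[4],
        "res_seq": int(tokens[5]),
        "xyz": (float(tokens[6]), float(tokens[7]), float(tokens[8])),
        "occupancy": float(tokens[9]),
        "b_factor": float(tokens[10]),
        "element": tokens[-1],  # last token, so an optional segment ID is skipped
    }


def distance(atom_a, atom_b):
    """Euclidean distance in Angstroms between two parsed atoms."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(atom_a["xyz"], atom_b["xyz"])))


# Alpha carbons of Phe 1 and Asp 2, with values copied from Table 19-1.
ca_phe1 = parse_atom_row("ATOM 2 CA PHE A 1 3.695 18.087 15.905 1.00 18.18 C")
ca_asp2 = parse_atom_row("ATOM 13 CA ASP A 2 4.015 18.670 19.654 1.00 15.04 C")
print("CA-CA distance: %.2f Angstroms" % distance(ca_phe1, ca_asp2))
```

The result is close to the roughly 3.8 Angstrom spacing expected between alpha carbons of adjacent residues, which serves as a quick sanity check on transcribed coordinates. Whitespace splitting would fail on records where adjacent fixed-width fields run together, so a column-based parser would be the safer choice for a complete file.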


































The surface accessible residues of ASP were determined from the crystallographic
coordinates provided above, using the program DS Modeling (Accelrys) with the default
settings. The total surface accessibility (SA) for ASP was found to be 8044.777 square Angstroms.
Table 19-2 provides the total SA, the side chain SA, and the percent SAS, where percent SAS is the
percentage of an amino acid's total surface that is accessible to solvent.



Table 19-2. Total Surface Accessibility of ASP

Residue         Total SA (Å²)   SideChain SA (Å²)   Percent SAS
asp 1:Phe              89.992              66.420        36.954
asp 2:Asp              85.970              68.625        48.199
asp 4:Ile              17.921              12.076         9.714
asp 7:Asn              40.541              40.541        21.246
asp 8:Ala              41.497              24.153        35.259
asp 10:Thr             35.846              35.846        21.190
asp 11:Ile             29.424              18.114        17.028
asp 12:Gly             81.658              30.191        73.513
asp 13:Gly             75.236              18.114        67.615
asp 14:Arg            124.289             124.289        55.664
asp 15:Ser             29.424              29.424        19.554
asp 16:Arg            105.411              88.447        38.127
asp 22:Ala             11.690               0.000         9.932
asp 24:Asn             71.105              65.067        47.079
asp 25:Gly             53.190              30.191        43.325
asp 32:His             34.693              17.728        19.568
asp 34:Gly             18.114              12.076        20.656
asp 35:Arg            177.087             171.242        69.918
asp 36:Thr             87.506              64.886        45.401
asp 37:Gly             58.465              24.153        55.659
asp 38:Ala             18.114              12.076        16.195
asp 39:Thr             99.579              87.889        55.002
asp 40:Thr             11.310               0.000         6.469
asp 41:Ala             36.229              36.229        38.182
asp 42:Asn             86.537              74.844        43.919
asp 43:Pro              6.038               0.000         4.599
asp 44:Thr            111.082              99.582        59.375
asp 45:Gly              6.038               6.038         5.436




acn Afi'Thr 
a op *KJ. Illl 




52.427 


28.958 




pen 47*Php 


5.655 


0.000 


2.715 


lb 


acn AA'Ala 
dop HO.nld 


co 040 


30.191 • 


52.705 






1 2.076 


12.076 ' 


12.937 




asp ou.oer 




U.wW 


07 OAQ 
Or .W*t57 




acn R1 "Qor 
dop sj 1 .oei 


17.348 


17.348 


11.573 




asp o^.rne 


O.C.UHU 


ip 07c 


pc 004 


20 


asp OO.riQ 


CO 1QO 


qc opq 


4n R1 1 




Ann C /i * f^M\ / 

asp o4.oiy 


1Q1 

OU. 1 57 1 


°.n 1Q1 

Ou. 1 0 1 


P7 P74 




/ton CC* Af>n 

asp oo./\sn 


OA AQQ 


04 4QQ 


ift fin 

1 0.D 1 0 




nun C7«Ti #r 

asp o / . i y • 


OR CCQ 


OQ fiCG 


11 ftRI 
I I .OO I 




Ann CO'DImn 

asp oy.rne 


1 ft 114 

lO.l 1**- 


1ft 1 14 
10. 1 l*r 


q ooft 
y.ouo 


25 


asp 61:Arg 


1Afi 7fln 


1A1 nci 


co APQ 




asp Oil. 1 III 


P? 81Q 








nnn fiO.^^lfcJ 

asp oo.iaiy 


17 COD 


8 nsfi 

O.UOO 


17 848 




asp o**.Mia 


119 PPQ 


8n i£M 


qn C84 




asp oo.uiy 


7fJ cqr 


sn 1Q1 

OU. 1 1 




art 
30 


acn fifi-Val 

asp oo.vai 


18 QfiS 


0 nnn 


1 0.967 




asp D/^sn 


ftQ noo 


ftp Qft4 


fiQP 




nnn CQ>| nt i 

asp do. Leu 


04 COO 
OH. QUO 


8 n*vi 

o.uoo . 


18 RA8 




asp 69:Leu 


AO Pfi7 


AO PR7 


OO PQC 




asp / 1 .uin 


OQ 774 


on 774 
0*7. / ff 


1ft <vRP 


35 


asp 73:Asn 


17 OAR 


17 OAR 
I f .OHO 


ft 7fin 

O./DU 




asp / 4. Asn 


A1 001 


41 *VI1 
*r 1 .OU 1 


pc oei 




nnn 7C«Tt/I' 

asp / o. i yr 


qq C44 


47.922 


37.830 




nnn TC'Cnr 

asp /o.oer 


Q7 888 


52.044 


76.965 




n.nn T7*f^\\t 

asp / r .vaiy 


A1 97 c 


P4 1co 

CM. IOO 


73.294 


40 


asp fO.Oiy 


17 QP1 

1 # .9^ 1 


12.076 


18.067 




dop /o.r\jy 


139.911 


94.292 


56.632 




asp 80:Val             36.229              30.191        22.621
asp 81:Gln             82.421              70.921        37.295
asp 83:Ala             41.117              24.153        33.386
asp 84:Gly             12.076              12.076        12.151
asp 85:His             71.298              65.454        36.451
asp 86:Thr            111.082              93.544        65.517
asp 87:Ala             64.886              42.267        52.523
asp 88:Ala             12.076               6.038        10.760
asp 89:Pro             90.572              78.496        58.405
asp 90:Val             94.694              66.420        53.062
asp 91:Gly             58.082              18.114        49.593
asp 92:Ser             34.886              23.003        27.450
asp 93:Ala             83.381              60.381        70.846
asp 95:Cys             26.565              26.565        15.773
asp 99:Ser             39.584               0.000        29.907
asp 100:Thr            87.123              47.155        48.121
asp 101:Thr            34.696               6.038        22.060
asp 102:Gly            12.076              12.076        13.771
asp 103:Trp            70.728              47.919        27.630
asp 104:His            47.726              41.687        23.152
asp 105:Cys            54.609              31.799        33.796
asp 106:Gly            23.386              12.076        23.531
asp 107:Thr            47.155              47.155        29.873
asp 108:Ile             5.655               0.000         2.888
asp 109:Thr            64.503              30.191        35.741
asp 110:Ala            24.153              24.153        21.668
asp 111:Leu            71.115              48.305        36.142
asp 112:Asn           138.770             104.841        66.301
asp 113:Ser            17.731              11.693        12.794




asp 114:Ser 


92.391 


5^427 


63.987 




asp 115:Val 


30.191 


24.153 


18.166 
asp 116:Thr           128.237              82.618        66.534
asp 117:Tyr            35.846              24.153        15.603
asp 118:Pro           159.964             102.648        93.188
asp 119:Glu           132.745              87.123        63.766


abp 




18.114 


18.114 


20.61 1 




3Sp 


I d. I . I fir 


93.924 


76.579 


48.828 




asp 


1oo. A rn 


129.748 


129.748 


59.619 




sen 
aop 




29.231 


12.076 


26.315 




asp 




6.038 


6 038 

u.vou 


3.084 


10 


asp 


1 97* Am 


99.943 


99.943 


36.957 




asp 


1 OQ-Thr 

i do. i nr 




o nnn 


^ 450 




asp 


1 29:Thr 


7fi C70 

/D.o/ y 


CQ pi C 


4C OIQ 




asp 


A OA.\/nl 


n nnn 
u.uuu 


n nnn 
u.uuu 


n nnn 
u.uuu 




asp 


i3i:oys 


oc cftA 


1Q 70O. 


1ft ftft.^ 


15 


asp 


1 3Z. Aia 


11 ftQO. 


a n**ft - 

O.UOO 


O.Hog 




asp 


A OO.Oli i 

loo.olU 


At\ 7 Oil 


9Q H/11 


on nc.7 




asp 


lOH.rTO 


AAA C0 1 
1 14.501 


mo RAP. 


CO QQ4 




asp 


135:Gly 


1 1 QQO 
1 I.OOO 


a noo 


11 Q70 

ii.y/y 




asp 


137:Ser 


5.555 


C CCC 
O.DOO 


O Q1 C 


20 


asp 


143: Ala 


17.73 1 


c rvoD 
O.UOO 


1 D 7fiO 
IO./ DO 




asp 


I44.wiy 


CQ RIO 


Oft 990 


eo coo 




asp 
asp 


1 45: Asn 


Q1 DQO 


70 1A9 






146:GIn 


52.810 


52.8 lO 


97 cm 
2/.51U 




asp 


147:Ala 


5.655 


n nnn 
U.UUU 


A 7Q7 

t A,f\3l 


25 


asp 


148:Gln 


1 1.5UU 


C £MC 

5.545 


C OOC 

O.OOO 




asp 


152:Ser 


5.655 


0.000 


a noo 

4.uy2 




asp 
asp 


153:Giy 


24.153 


18.114 


or oao 

25.31 9 




154:Gly 


63.927 


12.076 


o4.o22 




asp 


155:Ser 


88.656 


-yr\ cA + 

70.541 


CQ QC/1 


30 


asp 


156:Gly 


52.807 


AO A 4 A 

18.114 


cn nnn 

bU.UyU 




asp 


157: Asn 


35.263 


or OOO 

35.2o3 


2U.1W5 




asp 


158:Cys 


34.312 


O AID 

6.038 • 


01 QOO 

21 .owo 




asp 


159:Arg 


199.716 


154.094 


7Q nnn 




asp 


160:Thr 


135.044 


on a no 
89.422 


o5.oo2 


35 


asp 


161:Gly 


35.462 


CM 4 CO 

24.153 


oo ftao 




asp 


162:Gly 


rj4 C7C 

23.57b 


ft noo 
O.Uoo 


91 OOC 




asp 
asp 


•4 CQ.TVr 

loo.i nr 


4 a one 


ar nnc 

•H3.UUO 


oc 400 




164:Tnr 


c ccc 

5.555 


C ftCC 


O. 1Q7 




asp 


loo.rne 


OA 1CO 


OA 1RO 


in fififl 


40 


asp 


1 67:Gln 


c o>tc 
5.545 


C IMC 
5.545 


O. fU19 




asp 


168: Pro 


/io one 
43.305 


aq one 
4o.oU5 


0.1 007 

Ol .CULI 




asp 


A 7A<Ae>n 

1 /o.Asn 


co no9 
5y.Uo<£ 


CO 077 
DO.O/ f 


*W ftft? 
O 1 .OOfc 




asp 


i7i:Pro 


CO C4C 


AO 9fi7 


49 n97 




asp  173:Leu   17.731   12.076    8.274
asp  174:Gln  145.572  122.569   80.497
asp  175:Ala   52.044    6.038   44.291
asp  176:Tyr   64.886   36.229   29.811
asp  177:Gly   69.775   24.153   70.340
asp  178:Leu   11.693    6.038    5.788
asp  179:Arg  182.932  182.932   72.390
asp  180:Met   34.886   12.076   17.253
asp  181:Ile   36.229   30.191   19.053
asp  182:Thr   99.389   76.579   60.785
asp  183:Thr  104.854   93.544   68.979
asp  184:Asp  122.008   23.386   52.822



The ASP co-ordinates, and those of homologous structures, were loaded into MOE
(Chemical Computing Group). Co-ordinates for waters and ligands were removed. Using
MOE Align, the structures were aligned using actual secondary structure, with structural
alignment and superposition of chains enabled. This resulted in the following structural
alignment. The numbers indicated refer to the mature ASP protease amino acid sequence.

PDB ID  1 10 20 30 40

ASP     FDVTGGNAYTIG -GRSRC S IGFAVN GGFITAGHCGRTGATTAN PTGTFA
1HPG    --VLGGGAI YGG -G SR- C S AAFNVTK-GGARYFVTAGHCTNI SANWSASS - GG SWGVRE
1SGP    - - 1 SGGDAIYSS-TGR-CSLGFNVRS-GSTYYFLTAGHCT^
1TAL    ANIVGG I EYSINNASL - C SVGFSVTR-GATKGFVTAGHCGTVNATARIG GAWGTFA
2SFA    --IAGGEAIYAAGGGR-CSLGFlWRSSSGATYALTAGHCTEIASTVm'NSGQTSLLGTRA
2SGA    --IAGGEAITT-GGSR-CSLGFNVSV-NGVAHALTAGHCTNISASWS IGTRT

PDB ID  50 60 70 80 90 100

ASP     GSSFPGOTYAFVRTGAG-VNLIjAQVNNYSGGRVQVAGHTAAPVGSAVCRSGSTTGWHCGT
1HPG    GTSFPTNDYGIVRYTDG-SSPAGTVDLYNGSTQDISSAANAWGQAIKKSGSTTKVTSGT
1SGP    GS S FPNNDYG IVRYTNTTI PKDGTVG GQDITSAANATVGMAVTRRGSTTGTHSGS
1TAL    ARWPGNDRAWSLTSA-QTLLPRVANG-SSFVTVRGSTEAAVGAAVCRSGRTTGYQCGT
2SFA    GTSFPGNDYGLIRHSNA-SAAIX5RVYLYNGSYRDITGAGNAWGQTVQRSGSTTGLHSGR
2SGA    GTS FPNNDYGI I RHSNP - AAADGRVYLYNGSYQDITTAGNAFVGQAVQRSGSTTGLRSGS

PDB ID  110 120 130 140 150 160

ASP     ITALNSSVTYPE-GTVRGLIRTTVCAEPGDSGGSLLA-GNQAQGVTSGGSG NCRT
1HPG    VTAVNVTVNYGD-GPVYNMVRTTACSAGGDSGGAHFA-GSVALGIHSGSSG CSG
1SGP    VTALNATVNYGGGDVVYGMIRTNVCAEPGDSGGPLYS-GTRAIGLTSGGSG NCSS
1TAL    ITAKNVTANYAE -GAVRGLTQGNACMGRGDSGGSWITSAGQAQGVMSGGNVQSNGNNCG I
2SFA    VTGLNATVNYGGGDIVSGLIQTOVCAEPGDSGGALFA-GSTALGLTSGGSG NCRT
2SGA    VTGLNATVNYG S SG IVYGMI QTNVCAQPGDSGGSLFA-G STALGLTSGGSG NCRT

PDB ID  170 180

ASP     G GTTFFQPVNPILQAYGLRMITTD (SEQ ID NO:624)
1HPG    TA--G SAI HQPVTEALS AYGVTVY (SEQ ID NO:625)
1SGP    G--GTTFFQPVTEALVAYGVSVY-- (SEQ ID NO:626)
1TAL    PASQRSSLFERLQPILSQYGLSLVTG- (SEQ ID NO:627)
2SFA    G GTTFFQPVTEALSAYGVS TL (SEQ ID NO:628)
2SGA    G GTTFYQPVTEALSAYGATVL (SEQ ID NO:629)

In the above alignment, the codes are as follows:

1HPG = Streptomyces griseus glutamic acid specific protease
1SGP = Streptomyces griseus proteinase B
1SGT = Streptomyces griseus strain K1 trypsin
1TAL = Lysobacter enzymogenes alpha-lytic protease
2SFA = Streptomyces fradiae serine proteinase
2SGA = Streptomyces griseus protease A



EXAMPLE 20
Enzyme Substrate Modeling and Mapping of the ASP Active-Site

In this Example, methods for enzyme-substrate modeling and mapping of the ASP active
site are described. Preliminary inspection of the active site revealed a P1 binding
pocket that is large enough to accommodate large hydrophobic groups such as the
side-chains of Trp, Tyr, and Phe.

The crystal structure of streptogrisin A with the turkey third domain of the ovomucoid
inhibitor (PDB code 2SGB) has been determined. 2SGB was structurally aligned to ASP
using MOE (Chemical Computing Group), which places the inhibitor in the active site of
ASP. All of the 2SGB co-ordinates were removed, except for those which define a
hexapeptide bound in the ASP active site, corresponding to binding at the S4 to S2'
binding sites. The pro-ASP protein self-cleaves at the pro domain-mature domain junction
to release the mature protease enzyme. The last four residues of the pro domain are
expected to occupy the S1-S4 sites, and the first two residues of the mature protease
occupy the S1' and S2' sites. Therefore, the hexapeptide in the active site was mutated
in silico to the sequence PRTMFD (SEQ ID NO:630).

From inspection of the structure of the initial substrate-bound model, the backbone
amides of Gly135 and Asp136 would be expected to form the oxy-anion hole. However, the
amide nitrogen of Gly135 appears to point in the wrong direction. Comparison with
streptogrisin A confirms this. Thus, it is presumed that a conformational change in ASP
is required to form the oxy-anion hole. However, it is not intended that the present
invention be limited by any particular mechanism or hypothesis. The peptide backbone
between residues 134 and 135 was altered to an orientation similar to that of the
structurally equivalent atoms in the streptogrisin A structure. The enzyme-substrate
model was then energy minimized.

Residues within 6 Å of the modeled substrate were determined using the proximity
tools within the program QUANTA. These residues were identified as: Arg14, Ser15,
Arg16, Cys17, His32, Cys33, Phe52, Asp56, Thr100, Val115, Thr116, Tyr117, Pro118,
Glu119, Ala132, Glu133, Pro134, Gly135, Asp136, Ser137, Thr151, Ser152, Gly153,
Gly154, Ser155, Gly156, Asn157, Thr164, and Phe165. Of these, His32, Asp56, and Ser137
form the catalytic triad.
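
By way of illustration only, a distance-cutoff selection of this kind can be sketched in
Python. This is not the QUANTA proximity tool; it is a minimal sketch that assumes
whitespace-delimited ATOM records of the kind listed in Table 20-1, and that assumes
chain A is the enzyme and chain B is the modeled substrate. The function names
(parse_atoms, residues_near) and the five sample records are chosen for the example.

```python
import math

# Assumption: a handful of records copied from Table 20-1, whitespace-delimited as
# record, serial, atom name, residue, chain, residue number, x, y, z.
RECORDS = """\
ATOM 1 N PHE A 1 2.452 18.495 15.165
ATOM 222 ND1 HIS A 32 25.946 26.589 21.419
ATOM 223 CE1 HIS A 32 25.590 25.601 22.218
ATOM 1314 CG HIS B 17 30.806 26.351 22.601
ATOM 1315 ND1 HIS B 17 30.728 26.028 21.264
"""


def parse_atoms(text):
    """Turn whitespace-delimited ATOM lines into a list of dictionaries."""
    atoms = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 9 or fields[0] != "ATOM":
            continue  # skip TER records and incomplete lines
        atoms.append({
            "name": fields[2],
            "res": fields[3],
            "chain": fields[4],
            "resnum": int(fields[5]),
            "xyz": tuple(float(v) for v in fields[6:9]),
        })
    return atoms


def residues_near(atoms, enzyme_chain="A", substrate_chain="B", cutoff=6.0):
    """Return enzyme residues with any atom within `cutoff` Å of any substrate atom."""
    substrate = [a["xyz"] for a in atoms if a["chain"] == substrate_chain]
    hits = set()
    for atom in atoms:
        if atom["chain"] != enzyme_chain:
            continue
        if any(math.dist(atom["xyz"], s) <= cutoff for s in substrate):
            hits.add((atom["resnum"], atom["res"]))
    return [(res, num) for num, res in sorted(hits)]


NEAR = residues_near(parse_atoms(RECORDS))
print(NEAR)
```

With the five sample records, only His32 of chain A falls within the 6 Å cutoff, while
Phe1 does not.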

The P1 pocket is formed by Cys131, Ala132, Glu133, Pro134, Gly135, Thr151,
Ser152, Gly153, Gly154, Ser155, Gly156, Asn157, Gly162, Thr163, and Thr164. The P2
pocket is defined by Phe52, Tyr117, Pro118, and Glu119. The P3 pocket has main-chain to
main-chain hydrogen bonding from Gly154 to the substrate main-chain. The P1' pocket is
defined by Arg16 and His32. The P2' pocket is defined by Thr100 and Pro134. The
atomic coordinates of ASP with the modeled octapeptide substrate are provided in
Table 20-1 below.
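
For convenience, the pocket assignments above can be held in a simple lookup structure
and cross-checked against the list of residues within 6 Å of the modeled substrate. The
following Python sketch is illustrative only; the names POCKETS, WITHIN_6A and
pocket_residues_outside_shell are assumptions made for the example, and the residue
lists are transcribed from this Example.

```python
import re

# Residues reported within 6 Å of the modeled substrate (transcribed from the text).
WITHIN_6A = """Arg14 Ser15 Arg16 Cys17 His32 Cys33 Phe52 Asp56 Thr100 Val115 Thr116
Tyr117 Pro118 Glu119 Ala132 Glu133 Pro134 Gly135 Asp136 Ser137 Thr151 Ser152
Gly153 Gly154 Ser155 Gly156 Asn157 Thr164 Phe165""".split()

# Pocket definitions as stated in the text.
POCKETS = {
    "P1": """Cys131 Ala132 Glu133 Pro134 Gly135 Thr151 Ser152 Gly153 Gly154
             Ser155 Gly156 Asn157 Gly162 Thr163 Thr164""".split(),
    "P2": "Phe52 Tyr117 Pro118 Glu119".split(),
    "P3": ["Gly154"],
    "P1'": "Arg16 His32".split(),
    "P2'": "Thr100 Pro134".split(),
}


def resnum(label):
    """Extract the residue number from a label such as 'Gly154'."""
    return int(re.search(r"\d+", label).group())


def pocket_residues_outside_shell(pockets, shell):
    """List pocket residues that are absent from the proximity list."""
    shell_set = set(shell)
    outside = {r for members in pockets.values() for r in members
               if r not in shell_set}
    return sorted(outside, key=resnum)


OUTSIDE = pocket_residues_outside_shell(POCKETS, WITHIN_6A)
print(OUTSIDE)  # ['Cys131', 'Gly162', 'Thr163']
```

In this transcription, three of the P1 pocket residues do not appear in the 6 Å list,
which may simply reflect different selection criteria for the two lists.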



Table 20-1. Atomic Coordinates of ASP with the Modeled Octapeptide Substrate 



ATOM     1  N   PHE A   1     2.452  18.495  15.165  0.00  N1+
ATOM     2  CA  PHE A   1     3.712  18.208  15.901  0.00  C
ATOM     3  CB  PHE A   1     4.906  18.646  15.055  0.00  C
ATOM     4  C   PHE A   1     3.743  16.914  17.254  0.00  C
ATOM     5  O   PHE A   1     3.539  20.133  17.340  0.00  O
ATOM     6  CG  PHE A   1     6.232  18.405  15.707  0.00  C
ATOM     7  CD2 PHE A   1     6.963  17.268  15.411  0.00  C
ATOM     8  CD1 PHE A   1     6.750  19.312  16.618  0.00  C
ATOM     9  CE2 PHE A   1     8.192  17.035  16.010  0.00  C
ATOM    10  CE1 PHE A   1     7.981  19.086  17.222  0.00  C
ATOM    11  CZ  PHE A   1     8.702  17.946  16.917  0.00  C
ATOM    12  N   ASP A   2     4.000  18.148  18.311  0.00  N
ATOM    13  CA  ASP A   2     4.052  18.708  19.659  0.00  C
ATOM    14  CB  ASP A   2     3.584  17.678  20.688  0.00  C
ATOM    15  C   ASP A   2     5.422  19.210  20.066  0.00  C
ATOM    16  O   ASP A   2     6.415  18.508  19.925  0.00  O
ATOM    17  CG  ASP A   2     2.109  17.354  20.560  0.00  C
ATOM    18  OD2 ASP A   2     1.597  16.558  21.379  0.00  O1-
ATOM    19  OD1 ASP A   2     1.459  17.889  19.638  0.00  O
ATOM    20  N   VAL A   3     5.464  20.440  20.562  0.00  N
ATOM    21  CA  VAL A   3     6.707  21.057  21.009  0.00  C
ATOM    22  CB  VAL A   3     6.736  22.574  20.718  0.00  C
ATOM    23  C   VAL A   3     6.737  20.837  22.513  0.00  C
ATOM    24  O   VAL A   3     5.806  21.233  23.216  0.00  O
ATOM    25  CG1 VAL A   3     7.921  23.222  21.425  0.00  C
ATOM    26  CG2 VAL A   3     6.840  22.810  19.220  0.00  C
ATOM    27  CB  ILE A   4     7.602  18.448  24.730  0.00  C
ATOM    28  CG2 ILE A   4     7.684  18.189  26.227  0.00  C
ATOM    29  CG1 ILE A   4     6.196  18.137  24.220  0.00  C
ATOM    30  CD1 ILE A   4     5.768  16.711  24.456  0.00  C
ATOM    31  C   ILE A   4     9.379  20.168  24.911  0.00  C
ATOM    32  O   ILE A   4    10.346  19.836  24.229  0.00  O
ATOM    33  N   ILE A   4     7.801  20.200  22.997  0.00  N
ATOM    34  CA  ILE A   4     7.955  19.916  24.423  0.00  C
ATOM    35  N   GLY A   5     9.499  20.743  26.103  0.00  N
ATOM    36  CA  GLY A   5    10.807  21.030  26.653  0.00  C
ATOM    37  C   GLY A   5    11.655  19.787  26.819  0.00  C
ATOM    38  O   GLY A   5    11.171  18.750  27.277  0.00  O
ATOM    39  N   GLY A   6    12.927  19.885  26.443  0.00  N
ATOM    40  CA  GLY A   6    13.817  18.747  26.572  0.00  C
ATOM    41  C   GLY A   6    14.007  17.948  25.294  0.00  C
ATOM    42  O   GLY A   6    14.990  17.217  25.157  0.00  O
ATOM    43  N   ASN A   7    13.069  18.082  24.359  0.00  N
ATOM    44  CA  ASN A   7    13.155  17.351  23.100  0.00  C
ATOM    45  CB  ASN A   7    11.784  17.247  22.450  0.00  C
ATOM    46  CG  ASN A   7    10.918  16.210  23.102  0.00  C
ATOM    47  OD1 ASN A   7     9.741  16.069  22.760  0.00  O
ATOM    48  ND2 ASN A   7    11.492  15.464  24.049  0.00  N
ATOM    49  C   ASN A   7    14.124  17.933  22.086  0.00  C
ATOM    50  O   ASN A   7    14.466  19.114  22.119  0.00  O
ATOM    51  N   ALA A   8    14.561  17.077  21.176  0.00  N
ATOM    52  CA  ALA A   8    15.486  17.487  20.138  0.00  C
ATOM    53  CB  ALA A   8    16.212  16.271  19.577  0.00  C
ATOM    54  C   ALA A   8    14.716  18.174  19.023  0.00  C
ATOM    55  O   ALA A   8    13.509  17.988  18.874  0.00  O
ATOM    56  N   TYR A   9    15.423  18.993  18.262  0.00  N
ATOM    57  CA  TYR A   9    14.847  19.714  17.143  0.00  C
ATOM    58  CB  TYR A   9    14.253  21.064  17.580  0.00  C
ATOM    59  CG  TYR A   9    15.221  22.148  17.963  0.00  C
ATOM    60  CD2 TYR A   9    15.517  22.398  19.301  0.00  C
ATOM    61  CE2 TYR A   9    16.341  23.443  19.663  0.00  C
ATOM    62  CD1 TYR A   9    15.785  22.972  16.993  0.00  C
ATOM    63  CE1 TYR A   9    16.609  24.021  17.343  0.00  C
ATOM    64  CZ  TYR A   9    16.883  24.255  18.678  0.00  C
ATOM    65  OH  TYR A   9    17.688  25.309  19.029  0.00  O
ATOM    66  C   TYR A   9    16.072  19.837  16.262  0.00  C
ATOM    67  O   TYR A   9    17.188  19.678  16.753  0.00  O
ATOM    68  N   THR A  10    15.886  20.077  14.970  0.00  N
ATOM    69  CA  THR A  10    17.034  20.183  14.082  0.00  C
ATOM    70  CB  THR A  10    17.031  19.031  13.041  0.00  C
ATOM    71  OG1 THR A  10    15.822  19.082  12.269  0.00  O
ATOM    72  CG2 THR A  10    17.129  17.676  13.741  0.00  C
ATOM    73  C   THR A  10    17.205  21.488  13.329  0.00  C
ATOM    74  O   THR A  10    16.249  22.243  13.104  0.00  O
ATOM    75  N   ILE A  11    18.453  21.734  12.938  0.00  N
ATOM    76  CA  ILE A  11    18.828  22.930  12.197  0.00  C
ATOM    77  CB  ILE A  11    19.609  23.914  13.093  0.00  C
ATOM    78  CG2 ILE A  11    19.855  25.221  12.343  0.00  C
ATOM    79  CG1 ILE A  11    18.811  24.187  14.369  0.00  C
ATOM    80  CD1 ILE A  11    19.546  25.036  15.385  0.00  C
ATOM    81  C   ILE A  11    19.712  22.442  11.054  0.00  C
ATOM    82  O   ILE A  11    20.772  21.856  11.284  0.00  O
ATOM    83  N   GLY A  12    19.274  22.668   9.821  0.00  N
ATOM    84  CA  GLY A  12    20.048  22.193   8.689  0.00  C
ATOM    85  C   GLY A  12    20.344  20.705   8.845  0.00  C
ATOM    86  O   GLY A  12    21.439  20.239   8.523  0.00  O
ATOM    87  N   GLY A  13    19.373  19.957   9.361  0.00  N
ATOM    88  CA  GLY A  13    19.564  18.531   9.545  0.00  C
ATOM    89  C   GLY A  13    20.373  18.127  10.769  0.00  C
ATOM    90  O   GLY A  13    20.438  16.945  11.103  0.00  O
ATOM    91  N   ARG A  14    20.984  19.091  11.449  0.00  N
ATOM    92  CA  ARG A  14    21.787  18.782  12.627  0.00  C
ATOM    93  CB  ARG A  14    23.036  19.670  12.669  0.00  C
ATOM    94  C   ARG A  14    21.018  18.938  13.935  0.00  C
ATOM    95  O   ARG A  14    20.441  19.982  14.212  0.00  O
ATOM    96  CG  ARG A  14    24.251  19.072  11.964  0.00  C
ATOM    97  CD  ARG A  14    24.065  19.084  10.450  0.00  C
ATOM    98  NE  ARG A  14    24.173  17.752   9.858  0.00  N1+
ATOM    99  CZ  ARG A  14    25.316  17.100   9.660  0.00  C
ATOM   100  NH1 ARG A  14    26.474  17.655  10.004  0.00  N
ATOM   101  NH2 ARG A  14    25.302  15.886   9.120  0.00  N
ATOM   102  N   SER A  15    21.016  17.878  14.733  0.00  N
ATOM   103  CA  SER A  15    20.335  17.870  16.017  0.00  C
ATOM   104  CB  SER A  15    20.062  16.429  16.454  0.00  C
ATOM   105  C   SER A  15    21.312  18.525  16.983  0.00  C
ATOM   106  O   SER A  15    21.933  17.849  17.803  0.00  O
ATOM   107  OG  SER A  15    19.396  16.382  17.701  0.00  O
ATOM   108  N   ARG A  16    21.454  19.841  16.867  0.00  N
ATOM   109  CA  ARG A  16    22.362  20.594  17.724  0.00  C
ATOM   110  CB  ARG A  16    22.741  21.927  17.073  0.00  C
ATOM   111  C   ARG A  16    21.815  20.907  19.104  0.00  C
ATOM   112  O   ARG A  16    22.550  20.867  20.088  0.00  O
ATOM   113  CG  ARG A  16    23.719  21.851  15.915  0.00  C
ATOM   114  CD  ARG A  16    24.200  23.253  15.549  0.00  C
ATOM   115  NE  ARG A  16    24.625  23.984  16.745  0.00  N1+
ATOM   116  CZ  ARG A  16    25.242  25.166  16.739  0.00  C
ATOM   117  NH2 ARG A  16    25.581  25.735  17.888  0.00  N
ATOM   118  NH1 ARG A  16    25.528  25.781  15.597  0.00  N
ATOM   119  N   CYS A  17    20.526  21.215  19.178  0.00  N
ATOM   120  CA  CYS A  17    19.928  21.546  20.455  0.00  C
ATOM   121  CB  CYS A  17    19.800  23.068  20.553  0.00  C
ATOM   122  C   CYS A  17    18.599  20.911  20.803  0.00  C
ATOM   123  O   CYS A  17    18.071  20.077  20.071  0.00  O
ATOM   124  SG  CYS A  17    21.393  23.932  20.696  0.00  S
ATOM   125  N   SER A  18    18.066  21.348  21.942  0.00  N
ATOM   126  CA  SER A  18    16.799  20.865  22.455  0.00  C
ATOM   127  CB  SER A  18    17.042  20.053  23.723  0.00  C
ATOM   128  OG  SER A  18    18.081  19.111  23.521  0.00  O
ATOM   129  C   SER A  18    15.871  22.030  22.769  0.00  C
ATOM   130  O   SER A  18    16.312  23.175  22.890  0.00  O
ATOM   131  N   ILE A  19    14.584  21.728  22.892  0.00  N
ATOM   132  CA  ILE A  19    13.582  22.737  23.195  0.00  C
ATOM   133  CB  ILE A  19    12.150  22.152  23.125  0.00  C
ATOM   134  CG2 ILE A  19    11.133  23.215  23.532  0.00  C
ATOM   135  CG1 ILE A  19    11.852  21.634  21.715  0.00  C
ATOM   136  CD1 ILE A  19    11.832  22.709  20.655  0.00  C
ATOM   137  C   ILE A  19    13.794  23.273  24.614  0.00  C
ATOM   138  O   ILE A  19    14.070  22.505  25.545  0.00  O
ATOM   139  N   GLY A  20    13.670  24.589  24.774  0.00  N
ATOM   140  CA  GLY A  20    13.818  25.185  26.088  0.00  C
ATOM   141  C   GLY A  20    12.443  25.203  26.722  0.00  C
ATOM   142  O   GLY A  20    12.122  24.389  27.585  0.00  O
ATOM   143  N   PHE A  21    11.616  26.137  26.274  0.00  N
ATOM   144  CA  PHE A  21    10.253  26.258  26.763  0.00  C
ATOM   145  CB  PHE A  21    10.196  27.160  27.992  0.00  C
ATOM   146  CG  PHE A  21    10.855  26.559  29.195  0.00  C
ATOM   147  CD1 PHE A  21    10.269  25.491  29.857  0.00  C
ATOM   148  CD2 PHE A  21    12.086  27.025  29.638  0.00  C
ATOM   149  CE1 PHE A  21    10.898  24.898  30.936  0.00  C
ATOM   150  CE2 PHE A  21    12.713  26.435  30.715  0.00  C
ATOM   151  CZ  PHE A  21    12.122  25.370  31.366  0.00  C
ATOM   152  C   PHE A  21     9.391  26.825  25.664  0.00  C
ATOM   153  O   PHE A  21     9.865  27.597  24.830  0.00  O
ATOM   154  N   ALA A  22     8.131  26.413  25.646  0.00  N
ATOM   155  CA  ALA A  22     7.194  26.882  24.647  0.00  C
ATOM   156  CB  ALA A  22     6.014  25.915  24.533  0.00  C
ATOM   157  C   ALA A  22     6.719  28.230  25.138  0.00  C
ATOM   158  O   ALA A  22     6.416  28.388  26.320  0.00  O
ATOM   159  N   VAL A  23     6.677  29.202  24.239  0.00  N
ATOM   160  CA  VAL A  23     6.233  30.546  24.582  0.00  C
ATOM   161  CB  VAL A  23     7.402  31.570  24.551  0.00  C
ATOM   162  CG1 VAL A  23     8.328  31.338  25.728  0.00  C
ATOM   163  CG2 VAL A  23     8.182  31.442  23.248  0.00  C
ATOM   164  C   VAL A  23     5.206  30.945  23.545
ATOM   165  O   VAL A  23     5.053  30.267  22.526
ATOM   166  N   ASN A  24     4.495  32.036  23.791
ATOM   167  CA  ASN A  24     3.492  32.476  22.832
ATOM   168  CB  ASN A  24     2.807  33.759  23.328
ATOM   169  C   ASN A  24     4.177  32.715  21.484
ATOM   170  O   ASN A  24     5.050  33.576  21.365
ATOM   171  CG  ASN A  24     3.737  34.963  23.334
ATOM   172  OD1 ASN A  24     4.697  35.029  24.107
ATOM   173  ND2 ASN A  24     3.451  35.927  22.462
ATOM   174  N   GLY A  25     3.801  31.929  20.477
ATOM   175  CA  GLY A  25     4.396  32.084  19.158
ATOM   176  C   GLY A  25     5.503  31.104  18.788
ATOM   177  O   GLY A  25     5.925  31.054  17.635
ATOM   178  N   GLY A  26     5.989  30.327  19.748
ATOM   179  CA  GLY A  26     7.043  29.377  19.433
ATOM   180  C   GLY A  26     7.702  28.795  20.666
ATOM   181  O   GLY A  26     7.028  28.328  21.582
ATOM   182  N   PHE A  27     9.028  28.813  20.688
ATOM   183  CA  PHE A  27     9.757  28.294  21.832
ATOM   184  CB  PHE A  27     9.973  26.783  21.710
ATOM   185  C   PHE A  27    11.103  28.975  21.954
ATOM   186  O   PHE A  27    11.660  29.459  20.963
ATOM   187  CG  PHE A  27    10.949  26.376  20.624
ATOM   188  CD1 PHE A  27    10.504  26.078  19.336
ATOM   189  CD2 PHE A  27    12.306  26.246  20.905
ATOM   190  CE1 PHE A  27    11.391  25.650  18.352
ATOM   191  CE2 PHE A  27    13.202  25.819  19.926
ATOM   192  CZ  PHE A  27    12.742  25.518  18.648
ATOM   193  N   ILE A  28    11.615  29.020  23.180
ATOM   194  CA  ILE A  28    12.904  29.640  23.445
ATOM   195  CB  ILE A  28    12.843  30.524  24.704
ATOM   196  C   ILE A  28    13.953  28.542  23.603
ATOM   197  O   ILE A  28    13.640  27.426  24.011
ATOM   198  CG2 ILE A  28    11.915  31.688  24.450
ATOM   199  CG1 ILE A  28    12.350  29.718  25.904
ATOM   200  CD1 ILE A  28    12.270  30.524  27.176
ATOM   201  N   THR A  29    15.195  28.866  23.265
ATOM   202  CA  THR A  29    16.293  27.916  23.353
ATOM   203  CB  THR A  29    16.329  27.054  22.052
ATOM   204  OG1 THR A  29    17.423  26.126  22.095
ATOM   205  CG2 THR A  29    16.459  27.950  20.831
ATOM   206  C   THR A  29    17.601  28.695  23.538
ATOM   207  O   THR A  29    17.565  29.881  23.842
ATOM   208  N   ALA A  30    18.743  28.029  23.362
ATOM   209  CA  ALA A  30    20.059  28.662  23.510
ATOM   210  CB  ALA A  30    21.121  27.601  23.765
ATOM   211  C   ALA A  30    20.447  29.486  22.282
ATOM   212  O   ALA A  30    20.232  29.061  21.141
ATOM   213  N   GLY A  31    21.028  30.659  22.520
ATOM   214  CA  GLY A  31    21.427  31.522  21.423
ATOM   215  C   GLY A  31    22.508  30.942  20.528
ATOM   216  O   GLY A  31    22.527  31.212  19.322
ATOM   217  N   HIS A  32    23.410  30.143  21.099
ATOM   218  CA  HIS A  32    24.490  29.558  20.310
ATOM   219  CB  HIS A  32    25.648  29.091  21.215
ATOM   220  CG  HIS A  32    25.412  27.772  21.885
ATOM   221  CD2 HIS A  32    24.715  27.451  23.001
ATOM   222  ND1 HIS A  32    25.946  26.589  21.419
ATOM   223  CE1 HIS A  32    25.590  25.601  22.218

ATOM 


224 
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1.137 


30.133 


31.739 


0. 00 


Q 




ATOM  1274  N   THR A 183    -0.448  26.534  30.036  0.00  N
ATOM  1275  CA  THR A 183    -1.593  25.623  30.045  0.00  C
ATOM  1276  C   THR A 183    -1.754  25.043  28.647  0.00  C
ATOM  1277  O   THR A 183    -1.274  25.608  27.675  0.00  O
ATOM  1278  CB  THR A 183    -2.909  26.342  30.433  0.00  C
ATOM  1279  OG1 THR A 183    -3.716  25.460  31.228  0.00  O
ATOM  1280  CG2 THR A 183    -3.690  26.738  29.184  0.00  C
ATOM  1281  N   ASP A 184    -2.402  23.896  28.532  0.00  N
ATOM  1282  CA  ASP A 184    -2.573  23.318  27.213  0.00  C
ATOM  1283  C   ASP A 184    -4.035  23.091  26.918  0.00  C
ATOM  1284  O   ASP A 184    -4.380  22.208  26.174  0.00
ATOM  1285  CB  ASP A 184    -1.810  22.005  27.113  0.00  C
ATOM  1286  CG  ASP A 184    -0.464  22.056  27.794  0.00  C
ATOM  1287  OD1 ASP A 184     0.296  23.029  27.577  0.00
ATOM  1288  OD2 ASP A 184    -0.152  21.080  28.527  0.00
TER   1289      ASP A 184


ATOM  1290  N   ALA B  14    37.553  22.457  29.194  0.00  N1+
ATOM  1291  H   ALA B  14    36.582  22.364  28.935  0.00  H
ATOM  1292  H   ALA B  14    37.991  23.157  28.614  0.00  H
ATOM  1293  H   ALA B  14    38.021  21.572  29.065  0.00
ATOM  1294  CA  ALA B  14    37.649  22.863  30.616  0.00  C
ATOM  1295  C   ALA B  14    36.345  22.665  31.400  0.00  C
ATOM  1296  O   ALA B  14    36.364  21.816  32.304  0.00  O
ATOM  1297  CB  ALA B  14    38.235  24.270  30.658  0.00
ATOM  1298  N   ALA B  15    35.261  23.393  31.094  0.00  N
ATOM  1299  CA  ALA B  15    35.165  24.394  30.026  0.00  C
ATOM  1300  C   ALA B  15    34.368  23.941  28.790  0.00  C
ATOM  1301  O   ALA B  15    34.957  23.330  27.892  0.00  O
ATOM  1302  CB  ALA B  15    34.779  25.773  30.573  0.00  C
ATOM  1303  N   ALA B  16    33.028  24.069  28.763  0.00  N
ATOM  1304  CA  ALA B  16    32.304  23.388  27.683  0.00  C
ATOM  1305  C   ALA B  16    31.144  24.054  26.918  0.00  C
ATOM  1306  O   ALA B  16    30.114  24.490  27.453  0.00  O
ATOM  1307  CB  ALA B  16    32.420  21.850  27.713  0.00  C
ATOM  1308  H   ALA B  16    32.544  24.608  29.452  0.00  H
ATOM  1309  N   HIS B  17    31.370  24.111  25.600  0.00  N
ATOM  1310  CA  HIS B  17    30.508  24.676  24.521  0.00  C
ATOM  1311  C   HIS B  17    29.820  23.558  23.756  0.00  C
ATOM  1312  O   HIS B  17    30.487  22.621  23.291  0.00  O
ATOM  1313  CB  HIS B  17    31.473  25.545  23.683  0.00  C
ATOM  1314  CG  HIS B  17    30.806  26.351  22.601  0.00  C
ATOM  1315  ND1 HIS B  17    30.728  26.028  21.264  0.00  N
ATOM  1316  CD2 HIS B  17    30.170  27.551  22.772  0.00  C
ATOM  1317  CE1 HIS B  17    30.054  27.014  20.648  0.00  C
ATOM  1318  NE2 HIS B  17    29.694  27.965  21.525  0.00  N
ATOM  1319  H   HIS B  17    32.233  23.710  25.292  0.00  H
ATOM  1320  N   TYR B  18    28.491  23.661  23.613  0.00  N
ATOM  1321  CA  TYR B  18    27.651  22.538  23.244  0.00  C
ATOM  1322  C   TYR B  18    26.791  22.741  21.978  0.00  C
ATOM  1323  O   TYR B  18    25.936  21.904  21.762  0.00  O
ATOM  1324  CB  TYR B  18    26.869  22.044  24.476  0.00  C
ATOM  1325  CG  TYR B  18    27.638  21.257  25.527  0.00  C
ATOM  1326  CD1 TYR B  18    27.073  20.996  26.793  0.00  C
ATOM  1327  CD2 TYR B  18    28.818  20.596  25.160  0.00  C
ATOM  1328  CE1 TYR B  18    27.702  20.099  27.685  0.00  C
ATOM  1329  CE2 TYR B  18    29.420  19.668  26.020  0.00  C
ATOM  1330  CZ  TYR B  18    28.855  19.410  27.276  0.00  C
ATOM  1331  OH  TYR B  18    29.519  18.595  28.139  0.00  O
ATOM  1332  H   TYR B  18    28.022  24.521  23.872  0.00  H
ATOM  1333  N   ASP B  19    27.328  23.446  20.986  0.00  N
ATOM  1334  CA  ASP B  19    27.252  23.065  19.573  0.00  C
ATOM  1335  C   ASP B  19    25.957  22.335  19.178  0.00  C
ATOM  1336  O   ASP B  19    24.855  22.851  19.367  0.00  O
ATOM  1337  CB  ASP B  19    27.381  24.271  18.655  0.00  C
ATOM  1338  CG  ASP B  19    28.399  25.369  18.926  0.00  C
ATOM  1339  OD1 ASP B  19    28.777  25.568  20.105  0.00  O
ATOM  1340  OD2 ASP B  19    28.588  26.117  17.941  0.00  O1-
ATOM  1341  H   ASP B  19    28.092  24.050  21.252  0.00  H
ATOM  1342  N   GLU B  20    26.024  21.140  18.622  0.00  N
ATOM  1343  CA  GLU B  20    27.219  20.341  18.451  0.00  C
ATOM  1344  C   GLU B  20    27.848  20.634  17.079  0.00  C
ATOM  1345  O   GLU B  20    27.311  20.147  16.091  0.00  O
ATOM  1346  CB  GLU B  20    26.641  18.934  18.532  0.00  C
ATOM  1347  CG  GLU B  20    26.790  18.174  19.836  0.00  C
ATOM  1348  CD  GLU B  20    26.391  16.720  19.643  0.00  C
ATOM  1349  OE1 GLU B  20    26.614  16.043  20.673  0.00  O1-
ATOM  1350  OE2 GLU B  20    26.569  16.221  18.501  0.00  O
ATOM  1351  H   GLU B  20    25.129  20.696  18.442  0.00  H
ATOM  1352  N   ALA B  21    29.122  21.069  17.024  0.00  N
ATOM  1353  CA  ALA B  21    29.859  21.221  15.768  0.00  C
ATOM  1354  C   ALA B  21    30.422  19.894  15.208  0.00  C
ATOM  1355  O   ALA B  21    31.618  19.821  14.879  0.00  O
ATOM  1356  CB  ALA B  21    30.954  22.295  15.900  0.00  C
ATOM  1357  OXT ALA B  21    29.677  18.897  15.088  0.00  O1-
ATOM  1358  H   ALA B  21    29.585  21.298  17.880  0.00  H
TER   1359      ALA B  21

















EXAMPLE 21 
Oxidative Stability of ASP 

This Example describes experiments conducted to determine the oxidative stability 
of the ASP protease and mutant proteases. The resistance to oxidation of Cellulomonas 
69B4 protease was compared to that of: a BPN'-variant protease (BPN'-variant 1; 
Genencor; See, RE 34,606 [incorporated herein by reference], for a description of this 
enzyme); a GG36 variant protease (GG36-variant 1; Genencor; See e.g., U.S. Pat. Nos. 
5,955,340 and 5,700,676, herein incorporated by reference); and PURAFECT protease 
(Genencor). 

The assay was conducted by incubating a sample of the protease with 0.1 M H2O2.
A 2.0 ml volume of 0.1 M borate buffer (45.4 gm Na2B4O7·10H2O), pH 9.45, containing 0.1
M H2O2 and 100 ppm protease was incubated at 25°C for 20 minutes and assayed for
enzyme activity.

The enzyme activity was determined as follows: 50 µl of the incubation mixture was
combined with 950 µl 0.1 M Tris buffer, pH 8.6, and a 10 µl sample was taken and
added to 990 µl AAPF substrate solution (conc. 1 mg/ml) in 0.1 M Tris / 0.005% TWEEN®,
pH 8.6. The rate of increase in absorbance at 410 nm due to release of p-nitroaniline was
monitored. The results obtained for these proteases are provided in Figure 31. As
indicated in this graph, protease 69B4 showed greatly enhanced stability under oxidative
conditions relative to the subtilisin proteases.
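
Because the AAPF readout is a rate of absorbance increase, residual activity can be expressed as the ratio of two fitted slopes. The following is a minimal Python sketch of that calculation, assuming absorbance readings at 410 nm taken at fixed time points. The function names and the example readings are hypothetical and are not data from this Example.

```python
def absorbance_rate(times_s, a410):
    """Least-squares slope of A410 versus time (absorbance units per second)."""
    n = len(times_s)
    if n < 2 or n != len(a410):
        raise ValueError("need at least two paired readings")
    mean_t = sum(times_s) / n
    mean_a = sum(a410) / n
    num = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_s, a410))
    den = sum((t - mean_t) ** 2 for t in times_s)
    return num / den


def residual_activity_percent(rate_treated, rate_untreated):
    """Activity remaining after treatment, as a percentage of the untreated rate."""
    return 100.0 * rate_treated / rate_untreated


# Hypothetical readings (illustration only, not data from this Example).
times = [0, 30, 60, 90]
untreated = [0.100, 0.160, 0.220, 0.280]
treated = [0.100, 0.145, 0.190, 0.235]

remaining = residual_activity_percent(
    absorbance_rate(times, treated), absorbance_rate(times, untreated)
)
print(f"{remaining:.1f}% activity remaining")
```

Fitting a slope over several readings, rather than taking a two-point difference, is one way to reduce the influence of a single noisy reading; other approaches would serve equally well.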



WO 2005/052146 



PCT/US2004/039066 




EXAMPLE 22 
Chelate Stability of ASP 

In this Example, experiments to determine the chelate stability of ASP are described.
The resistance of 69B4 protease to the presence of a chelator was assayed by incubating
an aliquot of the enzyme with 10 mM EDTA in 50 mM Tris, pH 8.2. The same enzyme
preparations as used in Example 21 were used in these experiments.

Specifically, a volume of 2.0 ml 50 mM Tris buffer, pH 8.2, containing 10 mM EDTA
and 100 ppm protease was incubated at 45°C for 100 minutes and assayed for enzyme
activity as follows: 50 µl of the incubation mixture was combined with 950 µl 0.1 M Tris
buffer, pH 8.6, and a 10 µl sample was taken and added to 990 µl AAPF substrate
solution (conc. 1 mg/ml) in 0.1 M Tris / 0.005% TWEEN®, pH 8.6.

The rate of increase in absorbance at 410 nm due to release of p-nitroaniline was
monitored. The results obtained for these four proteases are shown in Figure 32. As
indicated by these results, protease 69B4 showed greatly enhanced stability in the presence
of a chelator relative to BPN'-variant-1, PURAFECT®, or GG36-variant-1.

EXAMPLE 23 
Thermal Stability of ASP 

In this Example, experiments conducted to determine the thermostability of ASP
protease are described. In one set of experiments, 69B4 protease was tested for
resistance to thermal inactivation in solution. As in Examples 21 and 22, a BPN' variant
(BPN'-variant-1), PURAFECT®, and a GG36 variant (GG36-variant-1) were also tested and
compared with ASP.

The thermal inactivation was performed by incubating a volume of 2.0 ml 50 mM
Tris buffer, pH 8.0, containing 100 ppm protease at 45°C for 300 minutes, after which
enzyme activity was assayed as follows: 50 µl of the incubation mixture was combined with
950 µl 0.1 M Tris buffer, pH 8.6, and a 10 µl sample was taken and added to 990 µl AAPF
substrate solution (conc. 1 mg/ml) in 0.1 M Tris / 0.005% TWEEN®, pH 8.6. The rate of
increase in absorbance at 410 nm due to release of p-nitroaniline was monitored. The
results for these four proteases are shown in Figure 33. As shown by these results,
protease 69B4 showed enhanced or comparable thermal stability at 45°C relative to the
BPN' variant, PURAFECT®, or the GG36 variant.

In addition to the above experiments, an alternative method for determining the
thermostability of ASP was also tested. In these experiments, a temperature gradient
between 57°C and 62°C was used. The thermal inactivation (using a Thermocycler MTP plate
DNA Engine Tetrad; MJ Research) was performed by incubating a volume of 180 µl 100 mM
Tris buffer, pH 8.6, containing 1 mM CaCl2 and 5 ppm protease for 60 minutes, after which
enzyme activity was assayed as follows: 10 µl was taken and added to 190 µl AAPF substrate
solution (conc. 1 mg/ml) in 0.1 M Tris / 0.005% TWEEN®, pH 8.6. The rate of increase in
absorbance at 410 nm due to release of p-nitroaniline was monitored (at 25°C). The results
for the four proteases are shown in Figure 34.

EXAMPLE 24 
pH profile of ASP Protease on DMC Substrate 

In this Example, experiments conducted to determine the pH profile of the ASP protease are
described. The Cellulomonas 69B4 protease of the present invention, isolated and purified by
methods described herein, and three currently used subtilisin proteases (PURAFECT®,
BPN'-variant-1, and GG36-variant-1) described in Examples 21-23 were analyzed for their
ability to hydrolyze a commercial synthetic substrate, di-methyl casein ("DMC"; Sigma C-9801),
in the pH range from 4 to 12.

The DMC method described at the beginning of the Experimental section was used,
with modifications, as indicated below. Briefly, a 5 mg/ml DMC substrate solution was
prepared in the appropriate buffer (5 mg/ml DMC, 0.005% (w/w) TWEEN-80®
(polyoxyethylene sorbitan mono-oleate, Sigma P-1754)). The appropriate DMC buffers
were composed as follows: 40 mM MES for pH 4 and 5; 40 mM HEPES for pH 6 and 7; 40
mM TRIS for pH 8 and 9; and 40 mM carbonate for pH 10, 11 and 12.

For the determination, 180 µl of each pH-substrate solution was transferred into a 96-well
microtiter plate and pre-incubated at 37°C for twenty minutes prior to enzyme
addition. The respective enzyme solutions (BPN'-variant-1; GG36-variant-1; PURAFECT®;
and 69B4 protease) were prepared at about 25 ppm, and 20 µl of each enzyme solution
was pipetted into the substrate-containing wells in order
to achieve a 2.5 ppm final enzyme concentration in each well. The 96-well plate containing
the enzyme-substrate mixtures was incubated at 37°C and 300 rpm for one hour in an
IKS-Multitron incubator/shaker.
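
The well concentration stated above follows from a simple dilution: 20 µl of an approximately 25 ppm enzyme solution added to 180 µl of substrate solution. A minimal Python sketch of that arithmetic is given below; the function name is hypothetical and the calculation assumes the volumes are additive.

```python
def final_concentration_ppm(stock_ppm, stock_volume_ul, diluent_volume_ul):
    """Concentration after mixing a stock into a diluent, assuming additive volumes."""
    total_volume_ul = stock_volume_ul + diluent_volume_ul
    return stock_ppm * stock_volume_ul / total_volume_ul


# 20 µl of ~25 ppm enzyme solution into 180 µl of pH-substrate solution.
well_ppm = final_concentration_ppm(stock_ppm=25.0, stock_volume_ul=20.0,
                                   diluent_volume_ul=180.0)
print(well_ppm)
```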

A 2,4,6-trinitrobenzene sulfonate ("TNBS") color reaction method was used to
determine the amount of peptides and amino acids released from the DMC substrate. The free
amino groups (of the peptides and amino acids) react with 2,4,6-trinitrobenzene sulfonic
acid to form a yellow colored complex. The absorbance was measured at 405 nm in a
SpectraMax 250 MTP Reader.

The TNBS assay was conducted as follows. A 1 mg/ml solution of TNBS (5%
2,4,6-trinitrobenzene sulfonic acid; Sigma P2297) was prepared in reagent buffer A (2.4 g
NaOH, 45.4 g Na2B4O7·10H2O dissolved by heating in 1000 ml). Then, 60 µl per well were
aliquoted into a 96-well plate and 10 µl of the incubation mixture described above were
added to each well and mixed for 20 minutes at room temperature. Then, 200 µl of reagent
B (70.4 g NaH2PO4·H2O and 1.2 g Na2SO3 in 2000 ml) were added to each well and mixed
to stop the reaction. The absorbance at 405 nm was measured in a SpectraMax 250 MTP
Reader. The absorbance value was corrected for a blank (without enzyme).
The data in Table 24-1 show the comparative ability of the 69B4 protease to hydrolyze such
substrate versus proteases from known mutant variants (BPN'-variant-1 and
GG36-variant-1).

Also, as shown in Figure 35, the serine protease of the present invention showed
comparable or increased hydrolysis of DMC substrate, with optimal DMC-hydrolysis
activity over a broad pH range from 7 to 12.



Table 24-1. TNBS Response (OD 405 nm)

Enzyme           pH 4    pH 5    pH 6    pH 7    pH 8    pH 9    pH 10   pH 11   pH 12
BPN'-variant-1   0.095   0.174   0.482   0.749   0.813   0.847   0.730   0.683   0.590
GG36-variant-1   0.228   0.172   0.499   0.740   0.958   1.062   1.068   1.175   1.136
PURAFECT®        0.042   0.202   0.545   0.783   0.956   1.130   1.102   1.188   1.174
69B4             0.252   0.218   0.575   0.742   0.803   0.965   0.762   0.741   0.729
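
The TNBS responses in Table 24-1 can be summarized per enzyme by locating the pH with the highest response and expressing every reading relative to that maximum. The following Python sketch illustrates one way to do so, using the tabulated values; the function names are hypothetical, and the normalization to the maximum is an illustrative choice rather than the analysis used to prepare Figure 35.

```python
PH_VALUES = [4, 5, 6, 7, 8, 9, 10, 11, 12]

# TNBS response (OD 405 nm) from Table 24-1, one list per enzyme, ordered by pH.
TNBS_OD405 = {
    "BPN'-variant-1": [0.095, 0.174, 0.482, 0.749, 0.813, 0.847, 0.730, 0.683, 0.590],
    "GG36-variant-1": [0.228, 0.172, 0.499, 0.740, 0.958, 1.062, 1.068, 1.175, 1.136],
    "PURAFECT": [0.042, 0.202, 0.545, 0.783, 0.956, 1.130, 1.102, 1.188, 1.174],
    "69B4": [0.252, 0.218, 0.575, 0.742, 0.803, 0.965, 0.762, 0.741, 0.729],
}


def ph_optimum(enzyme):
    """Return (pH, response) at which the enzyme gives its highest TNBS response."""
    readings = TNBS_OD405[enzyme]
    best_index = max(range(len(readings)), key=lambda i: readings[i])
    return PH_VALUES[best_index], readings[best_index]


def relative_profile(enzyme):
    """Map each pH to the response as a percentage of the enzyme's maximum."""
    readings = TNBS_OD405[enzyme]
    top = max(readings)
    return {ph: 100.0 * value / top for ph, value in zip(PH_VALUES, readings)}


for name in TNBS_OD405:
    print(name, ph_optimum(name))
```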



EXAMPLE 25 
pH Stability of ASP Protease 

In this Example, experiments conducted to determine the pH stability of the ASP
protease are described. As in Examples 21-24, two currently used subtilisin proteases
(PURAFECT® and BPN'-variant-1) were also tested.

The respective enzyme solutions (i.e., BPN'-variant-1, PURAFECT®, and 69B4
protease) were prepared containing 90 ppm protease in 0.1 M citrate buffer, pH 3, 4, 5 and
6. Then, 10 ml tubes containing 1 ml of buffered enzyme solution were placed in a GFL
1083 water bath set at 25°C, 35°C and 45°C, respectively, for 60 minutes. AAPF activity
was determined for each enzyme sample at time 0 and 60 minutes as described above. The
remaining enzyme activity was calculated and the results are provided in Table 25-1 below,
and are shown in Figures 25-28.

As indicated by the data in Table 25-1, the ASP protease is exceptionally stable at pH
3, 4, 5, and 6, at temperatures between 25°C and 45°C, as compared to the BPN'-variant-1
and PURAFECT®.



Table 25-1. pH Stability Data

        BPN' Variant-1         PURAFECT®              ASP
pH      25°    35°    45°      25°    35°    45°      25°    35°    45°
pH 3    39     1      0        42     2      0        97     109    95
pH 4    92     35     1        55     7      0        106    105    102
pH 5    112    82     12       95     68     8        114    115    106
pH 6    113    99     59       104    96     63       95     104    104
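
To compare the enzymes in Table 25-1 at a glance, the least favorable condition for each can be extracted from the tabulated values. The Python sketch below does this; the function name is hypothetical, and the values are assumed to represent remaining activity after 60 minutes relative to time 0, as described in the text above.

```python
TEMPERATURES_C = (25, 35, 45)

# Remaining activity from Table 25-1: enzyme -> pH -> values at 25, 35 and 45 degrees C.
REMAINING_ACTIVITY = {
    "BPN'-variant-1": {3: (39, 1, 0), 4: (92, 35, 1), 5: (112, 82, 12), 6: (113, 99, 59)},
    "PURAFECT": {3: (42, 2, 0), 4: (55, 7, 0), 5: (95, 68, 8), 6: (104, 96, 63)},
    "ASP": {3: (97, 109, 95), 4: (106, 105, 102), 5: (114, 115, 106), 6: (95, 104, 104)},
}


def least_stable_condition(enzyme):
    """Return (remaining activity, pH, temperature) for the lowest tabulated value.

    Ties are broken by the lowest pH, then the lowest temperature.
    """
    return min(
        (value, ph, temp)
        for ph, row in REMAINING_ACTIVITY[enzyme].items()
        for value, temp in zip(row, TEMPERATURES_C)
    )


for name in REMAINING_ACTIVITY:
    print(name, least_stable_condition(name))
```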



EXAMPLE 26 
Stability and Specificity of ASP 

In this Example, experiments conducted to determine the stability and specificity 
differences between ASP, ASP mutants, and FNA are described. These experiments were 
performed by formulating liquid TIDE® detergent (Procter & Gamble) with calcium formate 
(an anionic surfactant titrant), borate (a P1 binder/inhibitor), and glycerol (water ordering), 
either independently of or in combination with each other. The enzyme was tested under 
these conditions and the residual enzyme activity was determined over time at a fixed 
temperature. 

The experiments are described in greater detail below. Unformulated liquid TIDE®
detergent (i.e., without added enzyme stabilizing chemicals) was divided into eleven
aliquots. Then, glycerol, borax, or calcium formate were added to the detergent aliquots in
the proportions shown in Table 26-1.



Table 26-1. Detergent Additives (%)

Aliquot #   % Glycerol   % Borax   % Calcium Formate
1           5            0         .1
2           2.5          1.5       .05
3           5            3         0
4           0            3         0
5           2.5          1.5       .05
6           0            0         .1
7           0            3         .1
8           0            0         0
9           5            0         0
10          2.5          1.5       .05
11          5            3         .1



Each aliquot was pre-warmed to 90°F, and either FNA, ASP (wild-type) or an ASP
R18 variant was added to a concentration of approximately one gram per liter protease.
After thorough mixing, a portion was removed and assayed for activity with synthetic
AAPF-pNA substrate, as described above. After the assay, each aliquot was placed back
into a 90°F oven. The assay process was repeated over time, and the decline in activity
was plotted as % of T0 activity remaining.

Surprisingly, it was found that ASP did not have the same calcium formate or 
glycerol dependency as FNA. Furthermore, it was determined that borate (alone) had the 
most dramatic effect on stabilizing ASP. It was also found that the addition of stabilizing 
chemicals provided significant benefits to the wild-type ASP, as well as the ASP R18 variant, 
indicating that the variant site is independent of the borate-activated site. 

EXAMPLE 27 
LAS Stability of ASP 

In this Example, experiments conducted to determine the stability of ASP to anionic
surfactants are described. LAS (linear alkyl sulfonate), an anionic surfactant, is a
component of HDL detergents known to inactivate enzymes. The methods used are
described above.

It was determined that wild-type ASP incubated in LAS dissolved in Tris HCl, pH 8.6,
is inactivated (See, Table 27-1, below). Further study revealed that inactivation is rapid
(See, Table 27-2). As LAS is a negatively charged molecule, the hypothesis was developed
that electrostatic attraction of LAS to positively charged amino acid side chains of ASP
was the cause of the LAS sensitivity. To test this hypothesis, arginine residues
(wild-type ASP contains no lysine residues) were mutated to other amino acids.

Incubation of these mutants in 0.05% (w/v) LAS in Tris HCl, pH 8.6, for one hour
revealed that all arginine replacement mutants were more stable than wild-type ASP. In
contrast, non-arginine replacement mutations that were also tested for LAS stability were
generally not improved compared to wild-type (See, Table 27-3). Subsequent testing of
multiple arginine replacement mutants revealed that these enzymes are substantially more
stable than the wild-type enzyme, and more stable than single arginine replacement
mutants (See, Table 27-4).

Another anionic surfactant that is used in HDL detergents is AES. Wild-type ASP
was found to be unstable in high concentrations of AES (See, Table 27-5). The mutant ASP
R18 was found to be more stable than wild-type in AES (See, Table 27-5). Also, the rate of
inactivation by 5% AES was found to be higher for the wild-type than for the ASP R18
mutant (See, Table 27-6). These results confirm that replacement of arginine residues of
ASP improves the stability of ASP in anionic detergents in general. It is not intended that
the present invention be limited to any specific anionic detergents or mutations. Indeed, it is
contemplated that various anionic detergents (as well as other detergents) will find use in
the present invention, as will various ASP mutants.



Table 27-1. Inactivation of ASP by LAS in Tris HCl, pH 8.6

% LAS (w/v)       % Activity of Control
Control (0 LAS)   100
0.01              87
0.03              77
0.06              59
0.10              47
0.30              31
0.60              20
1.00              12
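
The dose-response data in Table 27-1 can be used to estimate the LAS concentration at which half of the control activity remains. The Python sketch below uses straight-line interpolation between the two bracketing measurements; this is an illustrative simplification and not a method described herein, and the function name is hypothetical.

```python
# Values from Table 27-1: LAS concentration (% w/v) and % activity of control.
LAS_PERCENT = [0.0, 0.01, 0.03, 0.06, 0.10, 0.30, 0.60, 1.00]
ACTIVITY_PERCENT = [100, 87, 77, 59, 47, 31, 20, 12]


def concentration_at_activity(target, concentrations, activities):
    """Linearly interpolate the concentration at which activity falls to `target`."""
    points = list(zip(concentrations, activities))
    for (c0, a0), (c1, a1) in zip(points, points[1:]):
        if a0 >= target >= a1:
            if a0 == a1:
                return c0
            return c0 + (a0 - target) * (c1 - c0) / (a0 - a1)
    raise ValueError("target activity is outside the measured range")


half_activity_las = concentration_at_activity(50, LAS_PERCENT, ACTIVITY_PERCENT)
print(f"about {half_activity_las:.2f}% (w/v) LAS")
```

Because the concentrations are spaced roughly logarithmically, interpolating on a log scale would be an equally reasonable choice and would give a slightly different estimate.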



Table 27-2. Time-course of ASP Inactivation by 0.1% LAS

Time (secs)   % Remaining Activity
0             100
60            45
120           26
240           20
600           11
Table 27-3. Stability of ASP and Single Mutants
(Incubated in 0.05% LAS in Tris HCl, pH 8.6, for 60 mins.)

Mutant      % Remaining Activity
Wild-type   18
R14I        47
n i Di      49
n IOU       Ou
R16Q        51
R35F        43
R127A       59
R127K       31
R127Q       52
R159K       25
T36S        11
G65Q        22
Y75G        7
N76L        17
S76V        17



Table 27-4. Stability of ASP and Multiple Arginine Replacements
(Incubated in 0.05% LAS in Tris HCl, pH 8.6, for 60 mins.)

Mutant      % Remaining Activity of 0 LAS Control
Wild-type   27.5
ASP R-1     98.8
ASP R-2     69.6
ASP R-3     100.2
ASP R-7     103.9
ASP R-10B   98.9
ASP R-18    100.9
ASP R-23    79.4

In this Table,
R-1 = R16Q/R35F/R159Q
R-2 = R159Q
R-3 = R16Q/R123L
R-7 = R14L/R127Q/R159Q
R-10B = R14L/R179Q
R-18 = R123L/R127Q/R179Q
R-21 = R16Q/R79T/R127Q
R-23 = R16Q/R79T
Table 27-5. Inactivation of ASP and ASP Mutant R-18 by AES in Tris HCl, pH 8.6
(% Remaining activity of 0% AES control)

% AES (v/w)   Wild-type ASP   ASP R-18
0             100             100
1             70              94
5             32              57



Table 27-6. Time-course of ASP and Mutant R-18 Inactivation
by 5% AES in Tris HCl, pH 8.6
(% Remaining Activity of 0% AES Control)

Time (Mins)   Wild-type ASP   ASP R-18
0             100             100
90            99              105
4020          15              83



EXAMPLE 28
Determination of ASP Autolysis Sites in the Presence and Absence of LAS Detergent

In this Example, experiments conducted to determine the ASP autolysis sites in the
presence and absence of LAS are described. ASP autolysis was evaluated in a buffer with
and without LAS (dodecylbenzene-sulfonic acid). Autolysis peptide assignments were made
based on the molecular weight and sequence of each peptide (from MS and MS/MS data,
respectively).
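
Assigning a peptide by molecular weight amounts to comparing a measured mass with the mass calculated from a candidate sequence. The Python sketch below computes the neutral monoisotopic mass of a peptide from standard residue masses and checks two of the autolysis peptides reported in this Example (ANPTGTF and AGSSFPGNDY). It assumes the tabulated masses are neutral monoisotopic values; the function name and the tolerance are hypothetical choices and do not describe the search program settings used herein.

```python
# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.01056  # one water is added for the free N- and C-termini


def peptide_mass(sequence):
    """Neutral monoisotopic mass (Da) of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[residue] for residue in sequence) + WATER_MASS


def matches(sequence, measured_mass, tolerance_da=0.1):
    """True if the calculated mass agrees with the measured mass within tolerance."""
    return abs(peptide_mass(sequence) - measured_mass) <= tolerance_da


for seq, measured in [("ANPTGTF", 706.3), ("AGSSFPGNDY", 1013.3)]:
    print(seq, round(peptide_mass(seq), 1), matches(seq, measured))
```

Mass alone cannot distinguish peptides of identical composition, which is why the sequence information from the MS/MS data is also needed.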

ASP (at a concentration of 0.35 µg/µL) was incubated (at 4°C) in 100 mM Tris, pH 8.6,
with and without 0.1% LAS (dodecylbenzene-sulfonic acid). Aliquots were taken at time
periods from 0 to 30 min of incubation and autolysis was terminated by the addition of TFA
(final concentration 1%). Aliquots (10 µL) were analyzed by liquid chromatography coupled
with electrospray tandem mass spectrometry (LC-ESI-MS/MS). Peptides were resolved
using an HPLC system (model 1100, Agilent Technologies) with a reversed-phase column
(Vydac C4, 0.3 mm ID x 150 mm) and a gradient from 0 to 100% solvent B
in 60 min at a flow rate of 5 µL/min (generated using a static split from a pump
flow rate of 250 µL/min). Solvent A consisted of 0.1% formic acid in water, and solvent B
was 0.1% formic acid in acetonitrile.

Mass spectra were acquired using an ion trap mass spectrometer (model LCQ Classic,
Thermo). The mass spectrometer was tuned for optimum detection of m/z 785 and
operated with a spray voltage of 2.5 kV and a heated capillary at 250°C. Mass spectra were
acquired with an injection time of 500 msec and 5 microscans. Tandem MS spectra were
acquired in data-dependent mode, with the most intense peak selected and fragmented with
a normalized collision energy of 35%. For relative peptide quantitation, peak areas were
determined using vendor software. The identity of the autolysis peptides was determined
using a database search program (TurboSequest, Thermo) run on a database containing the
ASP sequence. Database searches were performed with no enzyme selected, a threshold of
10000, dtafile parameters (peptide m/z error of 1.7, group 11, minimum ion count 15), and
database parameters (peptide error of 2.2, MS/MS ions error of 0.0, both B,Y ions).

Without LAS in the sample buffer, ASP cleavages were primarily observed at the
termini and in the middle of the molecule (positions Y9, F47, Y59, F165, Q174, Y176; See
Table 28-1, below). Relative quantitative data for the observed peptides and intact ASP were
plotted over the course of the experiment (See, Figure 25, Panel A). The majority of the
ASP remained intact and only 1% was in the form of cleaved peptides (protein:peptide ratio
of 99:1). These data indicated that the majority of ASP remains intact, folded, and resistant
to further autolytic cleavage.

With 0.1% LAS in the sample buffer, ASP cleavages were observed throughout the
protein (positions Y9, T40, F47, Y57, F59, R61, L69, F165, Q174, Y176). The majority of
the ASP was in the peptide form after 10 min (See, Figure 25, Panel B). After 60 min, the
protein:peptide ratio was <1:99. These data indicate that ASP is totally unfolded in the
presence of LAS detergent, and thus extensive cleavage throughout the sequence was
observed. The observed autolysis cleavage sites under the two conditions are summarized
in the following Table. In this Table, the amino acids preceding and following the periods
are the amino acids that immediately precede and follow the autolysis peptide. The
sequence between the periods indicates the sequence of the autolysis peptides observed.



Table 28-1. ASP Autolysis Peptides Observed With and Without 0.1% LAS

Peptide Sequence                   Start-End   Calculated   Measured    Observed in   Observed in
                                               Mass (Da)    Mass (Da)   0.1% LAS      0% LAS
.FDVIGGNAY.T (SEQ ID NO:631)       [1-9]       954.5        954.4       Y             Y
T.ANPTGTF.A (SEQ ID NO:632)        [41-47]     706.3        706.3       Y             N
F.AGSSFPGNDY.A (SEQ ID NO:633)     [48-57]     1013.4       1013.3      Y             N
F.AGSSFPGNDYAF.V (SEQ ID NO:634)   [48-59]     1231.5       1231.4      Y             Y
R.TGAGVNLL.A (SEQ ID NO:635)       [62-69]
F.FQPVNPI.L (SEQ ID NO:636)        [166-172]
F.FQPVNPILQ.A (SEQ ID NO:637)      [166-174]
                                   [166-176]
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743.3 


743.4 




I N 


813.4 




J Y 

I N 


1054.6 


1054.5 


N 


1288.^ 


—1288,3 


Y | 


Y 
Y 
EXAMPLE 29

**cnbed above, BZA was shown to inhiwt ° ' he , S ' andartl s "<*-MP F .pNA assay as 




15 



Approximately 200ug/mi ASP M , a * *k . 

~a t io„to 1 Om M ,whio hb y,e,er e „ cet !r h ™ S B2A 

enzyme incubated »*, o,% us and ~* 6 ** 
EXAMPLE 30
Testing of Mutant ASPs

In addition to the tests described above, tests were conducted on various mutants of
ASP. The methods described above in the Examples were used. In these Tables, the
"variant code" provides the wild-type amino acid, the position in the amino acid sequence,
and the replacement amino acid (i.e., "F001A" indicates that the phenylalanine at position 1
in the amino acid sequence has been replaced by alanine in this particular variant).
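
The variant code convention (wild-type residue, position, replacement residue) lends itself to simple parsing. The following Python sketch splits a code such as "R014L" into its three parts and applies it to a sequence, checking that the stated wild-type residue is actually present at that position. The function names are hypothetical, and the nine-residue fragment FDVIGGNAY (residues 1-9 of ASP, as reported for the autolysis peptide in Example 28) is used only for illustration.

```python
import re

# One-letter wild-type residue, numeric position, one-letter replacement residue.
_VARIANT_CODE = re.compile(r"^([ACDEFGHIKLMNPQRSTVWY])(\d+)([ACDEFGHIKLMNPQRSTVWY])$")


def parse_variant_code(code):
    """Split a code such as 'F001A' into (wild_type, position, replacement)."""
    match = _VARIANT_CODE.match(code.strip().upper())
    if match is None:
        raise ValueError(f"not a valid variant code: {code!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def apply_variant(sequence, code):
    """Return the sequence with the substitution applied (positions are 1-based)."""
    wild_type, position, replacement = parse_variant_code(code)
    if position < 1 or position > len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {sequence[position - 1]}"
        )
    return sequence[: position - 1] + replacement + sequence[position:]


n_terminal_fragment = "FDVIGGNAY"  # residues 1-9 only, for illustration
print(parse_variant_code("R014L"))
print(apply_variant(n_terminal_fragment, "F001A"))
```

Checking the wild-type residue before substituting guards against codes that were mistyped or garbled in transcription.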

ASP variants show activity on the substrate in the keratin assay described above.
The values are relative to wild type (WT) and were calculated as described in the
assay procedure. Values greater than 1 are indicative of better activity than WT ASP.



Table 30-1. Keratin Hydrolysis Results

Variant code   Keratin hydrolysis (relative)
F001T          1.24
F001D          1.13
F001H          1.04
F001M          1.01
F001E          1.01
V003L          1.08
I004E          1.00
N007L          1.18
A008E          1.18
A008G          1.13
A008D          1.04
T010N          1.27
T010E          1.20
T010D          1.13
T010G          1.04
I011A          1.01
G012D          1.17
G013S          1.16
G013M          1.02
G013A          1.01
R014L          1.52
R014Q          1.4S


R014I          1.4C
R014D          1.36
R014N          1.2
R014G          1.28
R014T          1.21
R014M          1.21
R014K          1.18
R014A          1.12
R014S          1.12
R014W          1.07
R014P          1.04
R014H          1.03
S015W          1.2( ]
S015T          1.0!
R016A          1.0< }
R016S          1.03
R016Q          1.01
I019V          1.1
N024E          2.4^
N024A          1 ■ / C >
N024T          1 5 C
N024Q          1 4C
N024V          1 2£
N024L          1.26
N024H          1.26
N024M          1.14
N024F          1.05
N024S          1.03
R035E          1.60
R035L          1.47
R035Q          1.42
R035F          1.41
R035A          1.37
R035K          1.26
R035T          1.22
R035H          1.18
R035M          1.17
R035Y          1.16
R035W          1.13
R035S          1.12
R035D          1.07
R035N          1.03
R035V          1.02
T036I          6.82
T036S          1.34
T036G          1.34
T036N          1.22
T036D          1.16
T036H          1.13
T036P          1.03


T036L          1.01
A038R          1.77
A038D          1.51
A038H          1.30
A038N          1.2$
A038F          1.22
A038L          1.19
A038S          1.18
A038Y          1.17
A038T          1.10
A038V          1.07
A038G          1.03
A038I          1.01
T040V          1.11


A041N          1.17
A041D          1.17
A041I          1.07
A041L          1.03
T044E          1 0*3
A048E          1.09
G049A          1 3fi
G049S          1 Pfi
G049H          1 1fi
G049F          1 19
G049L          1.04
G049T          1 on
S051D          1.33
S051Q          1.18
S051H          1.12
S051V          1.11
S051T          1.09
S051M          1.01
G054D          1.71
G054E          1.23
G054N          1.06
G054L          1.02
G054I          1.00
N055E          1.30
N055F          1.25
N055Q          1.05
R061M          1.20
R061T          1.16
R061E          1.15
R061H          1.10
R061S          1.0S
R061N          1.08
R061K          1.07
R061V          1.01
T062I          1.00
G063D          1.18
G063V          1.07





A064I 


1 40 


A064N 


1 21 


A064Y 


1 1Q 


A064L 


1 17 


A064V 


1 17 
i.i/ 


A064H 


1 1fi 

L ■ • 1 1 


A064F 


I . i o 


A064P 


I . I o 


A064T 


1 io 

1 . 1 c 


A064O 


1 i*i 


A064M 


1 HQ 

[ l . I c 




1-1 


rVUUHVV 


-f f\r\ 

1.09 


r\UD*fva 


1.0 


ClUDOr 


H Art 

1.42 




a on 

i.2£ 




^ on 

1.2S 




1.25 


OUOO 1 


, 1.25 


OUDOV 


1.23 


uUDOL . 


1.21 


uUDOT 


1.16 


VJIUDOM 


1.05 . 


vauoori 


1.02 


IMUD/ U 


1.36 


IMUD / O 


a\ on 

1.20 




1.12 


N0R7P 


1 .12 




■i a n 
1,10 


N067H 


1 .no 


N067A 


1 no 


N067Q 


i n7 

1 Ajf 


N067L 


1 ot; 




L068H 


1 07 




L069S 






L069H 






J)69V 






W70D 


1 90 




<\070H 


1 1fi 




W70G 


1 19 




W70S 


1 04 


( 


3071 G 


1 20 


( 


3071 H 


1 14 


C 


3071 D 


1.13 


C 


3071 S 


1.10 


c 


3071 A 


1.07 


c 


5071 N 


1.06 


c 


50711 


1.06 




'0721 


1.11 




N073T          1.95
N073S          1.07
N074G          1.75
Y075G          1.42
Y075F          1.24
S076D          1.69
S076V          1.48
S076E          1.47
S076Y          1.45
S076T          1.25
S076L          1.25
S076N          1.24
S076I          1.22
S076W          1.17
S076Q          1.13
S076A          1.08
G077T          2.13
G077S          1.21
G077N          1.06
G078D          1.35
G078A          1.27
G078S          1.07
G078N          1.07
G078V          1.03
G078T          1.00
R079G          1.48
R079D          1.44
R079P          1.43
R079A          1.31
R079E          1.31
R079L          1.25
R079V          1.25
R079T          1.23
R079M          1.23
R079S          1.23
R079C          1.02
V080L          1.03
Q081E          1.22
Q081D          1.12
Q081V          1.10
Q081H          1.10
Q081P          1.01
A083E          1.27
A083L          1.05
A083I          1.03
H085Q          1.26
H085T          1.22
H085L          1.14
H085M          1.10
H085A          1.06


H085S          1.02
T086D          1.33
T086E          1.24
T086I          1.08
T086L          1.07
T086Q          1.07
T086G          1.06
T086A          1.05
T086N          1.01
A088E          1.01
A088F          1.00
P089E          1.04
V090P          1.51
V090S          1.42
V090I          1.34
V090T          1.22
V090N          1.10
V090A          1.08
V090L          1.06
S092G          1.20
S092A          1.12
S092C          1.06
A093D          1.20
A093S          1.12
A093E          1.09
S099N          1.27
S099V          1.23
S099D          1.21
S099T          1.21
S099I          1.08
T101S          1.14
W103M          1.17
T107E          1.32
T107S          1.30
T107V          1.23
T107H          1.23
T107M          1.21
T107I          1.17
T107N          1.12
T107A          1.10
T107Q          1.03
T107K          1.01
T109E          1.36
T109I          1.11
T109G          1.10
T109A          1.10
T109L          1.08
T109P          1.05
T109H          1.03
T109N          1.00
A110S          1.10



A110T          1.03
A110H          1.01
L111E          1.08
N112E          1.61
N112D          1.42
N112Q          1.36
N112L          1.27
N112V          1.23
N112Y          1.20
N112I          1.13
N112S          1.06
N112R          1.04
S113T          1.21
S114A          1.12
V115A          1.15
T116E          1.34
T116Q          1.28
T116F          1.09
T116S          1.02
T121E          1.35
T121D          1.15
T121S          1.05
R123E          1.63
R123D          1.57
R123I          1.48
R123F          1.40
R123A          1.30
R123L          1.30
R123Q          1.29
R123N          1.24
R123H          1.22
R123T          1.16
R123Y          1.15
R123S          1.12
R123G          1.11
R123V          1.09
R123W          1.07
R123K          1.07
G124A          1.06
I126L          1.06
R127A          1.38
R127Q          1.23
R127H          1.1S
R127S          1.19
R127K          1.17
R127Y          1.15
R127E          1.14
R127F          1.11
R127T          1.04
R127C          1.01






R014P          12.43
S015R          57.77
S015H          53.39
S015C          50.38
S015E          25.99
S015Y          23.97
S015M          19.73
S015F          17.11
S015N          16.21
S015G          14.44
S015L          12.00
S015A          11.84
S015T          11.83
S015I          10.89
R016E          34.61
R016T          27.36
R016C          25.97
R016V          25.79
R016D          22.22
R016Q          19.87
R016I          19.83
R016S          10.71
A022C          27.48
A022S          25.99
N024E          23.54
N024T          18.16
N024G          15.54
N024S          14.04
N024F          13.05
N024V          11.86
I028V          14.49
R035E          88.92
R035D          76.48
R035Q          49.08
R035V          49.02
R035S          47.13
R035T          44.84
R035N          42.49
R035A          42.38
R035C          41.31
R035P          32.50
R035H          27.88
R035M          25.29
R035K          15.26
T036C          25.91
T036V          20.77
A038D          47.40
A038C          34.28
A038T          12.27
A041D          24.80






A041C          23.37
A041T          18.58
A041S          15.58
N042D          15.04
N042C          13.16
T044E          33.74
T044C          17.24
T046V          40.22
T046F          34.46
T046E          34.01
T046Y          27.10
T046C          23.20
F047R          46.98
F047V          20.38
F047I          12.72
A048E          29.23
G049C          64.06
G049Q          49.53
G049E          48.76
G049H          47.79
G049A          43.93
G049V          43.28
G049N          29.58
G049L          24.93
G049S          19.86
G049F          16.65
G049K          15.46
G049T          11.73
S051L          19.79
S051A          15.12
S051C          14.59
S051G          14.33
P053C          11.51
P053N          10.68
G054C          26.41
G054E          19.88
G054Q          12.71
G054K          11.71
N055G          33.29
N055A          15.31
D056L          42.96
D056F          17.11
Y057G          27.33
F059W          31.25
R061E          30.95
R061V          26.22
R061M          26.01
R061T          23.33
R061K          20.21
R061Q          18.05



G063D          13.79
A064C          15.65
G065D          14.73
V066N          16.37
A070M          21.09
A070G          15.83
A070P          14.86
Q071L          11.17
Y075W          10.97
G078H          12.06
R079T          16.18
R079V          15.24
R079L          12.03
V080E          10.65
Q081P          18.28
Q081G          15.49
Q081A          14.60
Q081E          14.36
Q081H          14.02
Q081S          13.51
Q081D          13.17
Q081Y          13.15
Q081F          12.61
Q081I          11.93
Q081W          11.89
Q081C          11.40
A083H          17.04
A083D          15.14
A083E          14.66
A083Y          12.54
A083V          11.93
A083N          11.52
A083M          11.35
A083F          11.21
A083I          10.80
H085P          10.62
T086E          16.60
T086I          13.95
T086C          13.70
T086W          13.45
T086V          12.92
T086Y          10.97
T086F          10.78
T086D          10.70
A087E          20.99
A087C          17.19
A087P          11.78
A088F          18.06
A088E          14.11


A088V          13.47
A088H          10.95
P089D          10.88
V090C          12.71
G091Q          23.98
S092T          17.35
S092I          11.15
S092C          10.93
S092L          10.60
A093H          14.05
S099A          28.58
S099G          22.20
S099K          17.98
S099Q          17.50
S099H          15.09
T100A          27.16
T100R          22.31
T100K          22.07
T100Q          15.53
T100C          11.47
W103L          20.25
H104M          10.65
T107R          26.61
T107H          12.35
T109E          24.23
T109K          17.25
N112P          25.16
N112E          17.68
N112D          15.90
S113C          35.77
S113A          16.28
S113D          14.68
S113H          13.27
S114C          22.24
S114E          16.60
S114D          11.86
T116C          16.41
T116N          14.90
T116G          14.42
T116A          11.29
P118R          28.25
P118K          23.28
P118C          16.70
P118A          15.98
P118W          15.50
P118G          14.55
P118H          13.73
P118F          12.80
P118Y          11.29
E119G          32.98


E119Y          29.43
E119R          26.97
E119T          26.28
E119V          24.47
E119N          20.71
E119A          19.95
E119L          15.83
E119S          15.80
E119Q          14.68
T121E          36.49
T121L          34.33
T121F          23.82
T121A          17.78
T121D          16.73
T121V          14.25
T121Q          12.39
T121G          12.17
T121S          11.93
T121N          11.51
R123D          48.24
R123Y          47.97
R123C          46.46
R123E          44.33
R123N          40.60
R123H          39.41
R123T          34.97
R123W          33.83
R123F          30.58
R123S          30.56
R123Q          25.60
R123V          24.71
R123M          18.54
R123A          17.24
R123K          16.38
R123G          16.12
R123I          16.04
G124D          25.10
G124N          12.84
L125Q          25.77
L125M          14.90
R127E          36.18
R127S          31.24
R127D          29.46
R127Q          27.92
R127K          25.25
R127A          21.74
R127C          16.40
R127T          14.31
R127Y          13.61
R127H          12.89
R127F          10.69



T128A 


21.49 


T128V 


12.94 


V130C 


12.97 


A132S 


19.09 


A132P 


11.71 


P134R 


22.20 


S140P 


21.06 


L141M 


18.59 


L141C 


12.46 


A143H | 


10.95 


G144E i 


12.63 


N145E 


12.29 


Q146D 


12.05 


T151L 


46.42 


T151C 


26.57 


T151V 


17.57 | 


S155C 


38.40 


S155W 


30.61 


S155Y 


23.95 


S155I 


22.60 


S155V 


21.53 


S155E 


19.78 


S155T 


17.58 


S155F 


17.11 


S155Q 


12.59 


N157D 


18.83 


R159T 


28.61 


R159E 


27.00 


R159Q 


25.25 


R159D 


23.12 


R159V 


22.92 


R159S 


22.29 


R159K 


20.78 


R159N 


19.95 


R159C 


19.24 


R159A 


19.09 


R159M 


15.74 


R159L 


14.00 


R159H 


12.56


R159Y


11.23


T160D 


15.18 


T160E 


11.72 


T163D 


23.84 


T163C 


19.09 


T163Q 


14.20 


T163R 


11.15 


F165W 


28.00 


F165E 


23.57 


F165H 


21.46 


F165S 


14.33


Q167E


64.13 


Q167S 


12.59 


V169A 


12.75 


N170D 


29.08 


N170C 


23.07 


N170L 


14.63 


N170G 


13.30 


N170A 


12.77 


N170P 



12.72 




20.40 


G174C 


16.62 




14.76 




14.54


Q174V 


13.40 


Q174H 


11.18 


A175T 


16.19


G177D 


24.74 


G177E 


21.37 


G177C 


14.01 


G177N 


11.53 


R179E 


25.06


R179D


24.16 


R179C 


20.71 


R179V 


20.09 


R179I 


19.51 


R179T 


19.20 


R179Y 


17.89 


R179M 


16.74 


R179S 


16.12 


R179N 


16.11 


R179F 


15.67 


R179W 


15.56 


R179L 


15.12 


R179A 


14.35 


R179K 


12.30 


M180L 


25.64 


M180I 


12.31 


I181C 


11.51 


T182L 


12.63 


T183D 


13.51 


T183E 


13.32 


S185D 


14.31 



S185C 


13.10 


S185Y 


10.74 


S185N 


10.73 


G186E 


14.36 


G186P 


13.48 


G186C 


11.96 


S187E 


15.92 


S187F 


13.28 


S187L 


12.26 


S187C 


11.34 


S187W 


11.21 


S187G 


10.83 


S187A 


10.72 


S187V 


10.71 


S187H 


10.66 


S188E 


15.00 


S188C 


12.56 


S188T 


11.89


S188G


11.15


S188V 10.68 
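
The variant labels in the listing above follow the usual substitution shorthand (original residue, position, replacement residue). As a minimal sketch only, the labels and their reported values could be parsed and grouped as follows; the function names and the choice to keep the highest value per position are illustrative assumptions, not part of the described method.

```python
import re
from typing import Dict, Iterable, NamedTuple, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
_LABEL = re.compile(rf"^([{AMINO_ACIDS}])(\d+)([{AMINO_ACIDS}])$")


class Substitution(NamedTuple):
    wild_type: str  # one-letter code of the original residue
    position: int   # residue number as written in the label
    variant: str    # one-letter code of the replacement residue


def parse_substitution(label: str) -> Substitution:
    """Split a label such as 'T107R' into (wild type, position, variant)."""
    match = _LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"not a single-substitution label: {label!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))


def best_per_position(rows: Iterable[Tuple[str, float]]) -> Dict[int, Tuple[str, float]]:
    """For each position keep the (label, value) pair with the highest value."""
    best: Dict[int, Tuple[str, float]] = {}
    for label, value in rows:
        position = parse_substitution(label).position
        if position not in best or value > best[position][1]:
            best[position] = (label, value)
    return best


if __name__ == "__main__":
    # A few rows copied from the listing above.
    rows = [("R123D", 48.24), ("R123Y", 47.97), ("T151L", 46.42), ("T151C", 26.57)]
    print(best_per_position(rows))
```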



EXAMPLE 31 
Determination of ASP Cleaning Activity 

In this Example, experiments conducted to determine the cleaning activity of ASP 
under various conditions, as well as the properties of the various wash conditions are 
described. 

There is a wide variety of wash conditions including varying detergent formulations,
wash water volume, wash water temperature, and length of wash time. Thus, detergent
components such as proteases must be able to tolerate and function under adverse
environmental conditions. For example, detergent formulations used in different areas have
different concentrations of their relevant components present in the wash water. For
example, a European detergent typically has about 3000-8000 ppm of detergent
components in the wash water, while a Japanese detergent typically has less than 800 ppm
(e.g., 667 ppm) of detergent components in the wash water. In North America, particularly
the United States, detergents typically have about 800 to 2000 ppm (e.g., 975 ppm) of
detergent components present in the wash water.

Latin American detergents are generally high suds phosphate builder detergents and
the range of detergents used in Latin America can fall in both the medium and high
detergent concentrations, as they range from 1500 ppm to 6000 ppm of detergent
components in the wash water. Brazilian detergents typically have approximately 1500 ppm
of detergent components present in the wash water. However, other high suds phosphate
builder detergent geographies, not limited to other Latin American countries, may have high
detergent concentration systems up to about 6000 ppm of detergent components present in
the wash water.

In light of the foregoing, it is evident that concentrations of detergent compositions in
typical wash solutions throughout the world vary from less than about 800 ppm of
detergent composition ("low detergent concentration geographies"), for example about 667
ppm in Japan, to between about 800 ppm and about 2000 ppm ("medium detergent
concentration geographies"), for example about 975 ppm in the U.S. and about 1500 ppm in
Brazil, to greater than about 2000 ppm ("high detergent concentration geographies"), for
example about 3000 ppm to about 8000 ppm in Europe and about 6000 ppm in high suds
phosphate builder geographies.
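
The three concentration bands just described can be written down as a small classifier. This is only an illustrative sketch: the function name is hypothetical, and because the text gives the thresholds as "about 800 ppm" and "about 2000 ppm", treating them as exact cut-offs is an assumption.

```python
def detergent_concentration_band(ppm: float) -> str:
    """Return the concentration band for a wash-water detergent level in ppm.

    Thresholds follow the approximate figures in the text; exact cut-offs
    (and which band a boundary value falls into) are assumptions.
    """
    if ppm < 0:
        raise ValueError("concentration cannot be negative")
    if ppm < 800:
        return "low"
    if ppm <= 2000:
        return "medium"
    return "high"


if __name__ == "__main__":
    # Example figures quoted in the text.
    for region, ppm in [("Japan", 667), ("U.S.", 975), ("Brazil", 1500), ("Europe", 3000)]:
        print(region, ppm, detergent_concentration_band(ppm))
```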

The concentrations of the typical wash solutions are determined empirically. For
example, in the U.S., a typical washing machine holds a volume of about 64.4 L of wash
solution. Accordingly, in order to obtain a concentration of about 975 ppm of detergent
within the wash solution, about 62.79 g of detergent composition must be added to the 64.4
L of wash solution. This amount is the typical amount measured into the wash water by the
consumer using the measuring cup provided with the detergent.
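
The arithmetic behind the 62.79 g figure can be checked with a short calculation. The sketch below assumes that ppm is taken as milligrams of detergent per liter of wash solution; the function name is illustrative.

```python
def detergent_dose_grams(target_ppm: float, wash_volume_liters: float) -> float:
    """Grams of detergent needed to reach target_ppm in the given volume.

    Assumes 1 ppm = 1 mg of detergent per liter of wash solution.
    """
    milligrams = target_ppm * wash_volume_liters
    return milligrams / 1000.0


if __name__ == "__main__":
    dose = detergent_dose_grams(975, 64.4)
    print(f"{dose:.2f} g")  # 62.79 g, matching the figure in the text
```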

As a further example, different geographies use different wash temperatures. The
temperature of the wash water in Japan is typically less than that used in Europe. For
example, the temperature of the wash water in North America and Japan can be between
10 and 30°C (e.g., about 20°C), whereas the temperature of wash water in Europe is
typically between 30 and 50°C (e.g., about 40°C).

As a further example, different geographies may have different water hardness.
Water hardness is typically described as grains per gallon mixed Ca2+/Mg2+. Hardness is a
measure of the amount of calcium (Ca2+) and magnesium (Mg2+) in the water. Most water in
the United States is hard, but the degree of hardness varies from area to area. Moderately
hard (60-120 ppm) to hard (121-181 ppm) water has 60 to 181 parts per million of hardness
minerals (parts per million divided by 17.1 equals grains per U.S. gallon).
Table 31-1 provides ranges of water hardness.



Table 31-1. Water Hardness Ranges

Water              Grains per Gallon     Parts per Million
Soft               less than 1.0         less than 17
Slightly hard      1.0 to 3.5            17 to 60
Moderately hard    3.5 to 7.0            60 to 120
Hard               7.0 to 10.5           120 to 180
Very hard          greater than 10.5     greater than 180
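
The conversion rule (ppm divided by 17.1 gives grains per U.S. gallon) and the ranges of Table 31-1 can be combined in a short helper. This is a sketch under stated assumptions: the function names are hypothetical, the first row of the table is read here as "Soft", and because adjacent ranges in the table share their end points, assigning a boundary value to the harder class (except 10.5, which stays "Hard") is a choice made only for illustration.

```python
PPM_PER_GRAIN_PER_GALLON = 17.1  # conversion factor stated in the text


def ppm_to_gpg(ppm: float) -> float:
    """Convert hardness in parts per million to grains per U.S. gallon."""
    return ppm / PPM_PER_GRAIN_PER_GALLON


def gpg_to_ppm(gpg: float) -> float:
    """Convert hardness in grains per U.S. gallon to parts per million."""
    return gpg * PPM_PER_GRAIN_PER_GALLON


def classify_hardness(gpg: float) -> str:
    """Map grains per gallon onto the classes of Table 31-1 (boundaries assumed)."""
    if gpg < 1.0:
        return "Soft"
    if gpg < 3.5:
        return "Slightly hard"
    if gpg < 7.0:
        return "Moderately hard"
    if gpg <= 10.5:
        return "Hard"
    return "Very hard"


if __name__ == "__main__":
    # Hardness values used for the wash conditions in this Example.
    for label, gpg in [("Japanese", 3), ("North American", 6), ("European", 15)]:
        print(label, gpg, "gpg =", round(gpg_to_ppm(gpg), 1), "ppm:", classify_hardness(gpg))
```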



European water hardness is typically greater than 10.5 (e.g., 10.5-20.0) grains per
gallon mixed Ca2+/Mg2+ (e.g., about 15 grains per gallon mixed Ca2+/Mg2+). North American
water hardness is typically greater than Japanese water hardness, but less than European
water hardness. For example, North American water hardness can be between 3 to 10
grains, 3-8 grains or about 6 grains. Japanese water hardness is typically lower than North
American water hardness, typically less than 4, for example 3 grains per gallon mixed
Ca2+/Mg2+.

The present invention provides protease variants that provide improved wash 
performance in at least one set of wash conditions and typically in multiple wash conditions. 

As described herein, the protease variants are tested for performance in different 
types of detergent and wash conditions using a microswatch assay (See above, and U.S.
Pat. Appln. Ser. No. 09/554,992; and WO 99/34011, both of which are incorporated by 
reference herein). Protease variants are tested for other soil substrates also in a similar 
fashion. 

In the experiments conducted to determine cleaning activity of ASP, the following
methods were used. An incubator (Innova 4330 Model Incubator, New Brunswick) was
pre-warmed for 60 minutes to 40°C for "European" conditions and to 20°C for "Japanese"
conditions. Blood-Milk-Ink swatches (EMPA 116) were obtained from the Swiss Federal
Laboratories for Material Testing and from CFT Research, and were modified by exposure
to 0.03% hydrogen peroxide for 30 minutes at 60°C, then dried. Circles of 1/4" diameter
were cut from the dried swatches and placed vertically, one per well, in a 96 well microplate.

Protease samples of ASP were diluted in 10 mM NaCl, 0.005% TWEEN®-80 to
provide the desired concentration of 10 ppm (protein). To provide "North American wash
conditions," 1 gram per liter TIDE® laundry detergent (Procter & Gamble) without bleach
was prepared in deionized water, and a concentrated stock of calcium and magnesium was
added to result in a final water hardness value of 6 grains per gallon. To provide "European
wash conditions," 7.6 grams per liter ARIEL® REGULAR laundry detergent (Procter &
Gamble) without bleach was prepared in deionized water, and a concentrated stock of
calcium and magnesium was added to result in a final water hardness value of 15 grains per
gallon. To provide "Japanese wash conditions," 0.67 gram per liter PURE CLEAN laundry
detergent (Procter & Gamble) without bleach was prepared in deionized water, and a
concentrated stock of calcium and magnesium was added to result in a final water hardness
value of 3 grains per gallon.

In yet another detergent composition to provide "Japanese wash conditions with 
North American detergent formulation," 0.66 gram per liter Detergent Composition III without 
bleach was prepared in deionized water, and a concentrated stock of calcium and 
magnesium was added to result in a final water hardness value of 3 grains per gallon. 

The detergent solutions were allowed to mix for 15 minutes and were then filtered
through a 0.2 micron cellulose acetate filter. Then, 190 µl of the respective detergent
solution was added to the appropriate wells of a microplate, and 10 µl of the enzyme
preparation was added to the filtered detergent in order to obtain a final concentration of
0.25-3.0 ppm (micrograms per milliliter) of enzyme, for a total volume of 200 µl. The
microplate was then sealed to prevent leakage, placed in a holder on an incubator/shaker
set to 20°C and 350/400 RPM and allowed to shake for one hour.
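
The well volumes above fix a 20-fold dilution of the enzyme preparation (10 µl into a 200 µl total). The sketch below works that relationship in both directions; the function names and default arguments are illustrative assumptions, and the calculation ignores any volume change on mixing.

```python
def final_enzyme_ppm(stock_ppm: float, enzyme_ul: float = 10.0, detergent_ul: float = 190.0) -> float:
    """Enzyme concentration in the well after adding the enzyme preparation."""
    return stock_ppm * enzyme_ul / (enzyme_ul + detergent_ul)


def required_stock_ppm(final_ppm: float, enzyme_ul: float = 10.0, detergent_ul: float = 190.0) -> float:
    """Enzyme preparation concentration needed to reach final_ppm in the well."""
    return final_ppm * (enzyme_ul + detergent_ul) / enzyme_ul


if __name__ == "__main__":
    # End points of the 0.25-3.0 ppm dosage range stated in the text.
    for target in (0.25, 3.0):
        print(f"{target} ppm in well needs a {required_stock_ppm(target):.1f} ppm preparation")
    # A 10 ppm preparation, as described for the ASP samples, gives:
    print(f"{final_enzyme_ppm(10.0):.2f} ppm in the well")
```

Under these assumptions, covering the stated 0.25-3.0 ppm range corresponds to enzyme preparations of 5 to 60 ppm, and a 10 ppm preparation gives 0.5 ppm in the well.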

The plate was then removed from the incubator/shaker and an aliquot of 100 µl of
solution was removed from each well, and placed on a fresh Costar microtiter plate 
(Corning). The absorbance at 405 nm wavelength was read for each aliquot on a Microtiter 
plate reader (SpectraMax 340, Molecular Devices), and reported. The detergent 
composition and incubation conditions in the microswatch assay are set forth in Table 31-2. 



Table 31-2. Detergent Composition and Incubation Conditions

Geography           Detergent                       Water Hardness       Enzyme dosage   Temperature   Swatch
Powder detergent
European            Ariel Regular, 7.6 g/l          15 gpg, Ca/Mg=4/1    0.25-3.0 ppm    40°           Superfix
North American      Detergent Comp. III, 1.0 g/l    6 gpg, Ca/Mg=3/1     0.25-3.0 ppm    20°           3K
Japanese            Pure Clean, 0.66 g/l            3 gpg, Ca/Mg=3/1     0.25-3.0 ppm    20°           3K
Japanese (pseudo)   Detergent Comp. III, 0.66 g/l   3 gpg, Ca/Mg=3/1     0.25-3.0 ppm    20°           3K
Liquid detergent    Liquid-Tide® (1-5 ml/L)         6 gpg                0.25-3.0 ppm    20°           3K

The dose response curves depicting absorbance at 405 nm as a function of 
concentration (ppm in well), for PURAFECT® (Genencor), OPTIMASE® (Genencor), 
RELASE™ (Genencor; GG36-variant described above), and ASP are provided in Figures
23-27.

As indicated in Figure 26, under North American conditions, in liquid TIDE® 
detergent, the ASP protease showed enhanced cleaning performance as compared to 
PURAFECT®, RELASE™ and OPTIMASE™ proteases under the same conditions. Under 
Japanese conditions, in Detergent Comp. III powder (0.66 g/l), ASP showed enhanced or
the same cleaning performance as compared to PURAFECT®, RELASE™ and 
OPTIMASE™ proteases under the same conditions (See, Figure 27). Under European 
conditions, in ARIEL® REGULAR powder detergent, the ASP protease showed enhanced 
cleaning performance as compared to PURAFECT®, RELASE™ and OPTIMASE™ 
proteases under the same conditions (See, Figure 28). In both tests, ASP and
OPTIMASE™ provided results that were 2 to 10 times the absorbance at 405 nm as 
compared to PURAFECT® and RELASE™. Under Japanese conditions, in PURE CLEAN 
powder detergent (See, Figure 29), the ASP protease showed enhanced and comparative 
cleaning performance as compared to PURAFECT®, RELASE™ and OPTIMASE™ 
proteases under the same conditions. Under North American conditions, in Detergent 
Composition III powder detergent (See, Figure 30), the ASP protease showed enhanced or 
comparative cleaning performance as compared to PURAFECT®, RELASE™ and 
OPTIMASE™ proteases under the same conditions. 

EXAMPLE 32 
Liquid Fabric Cleaning Compositions 

This Example provides liquid fabric cleaning compositions that find use in 
conjunction with the present invention. These compositions are contemplated to find 
particular utility under Japanese machine wash conditions, as well as for applications 
involving cleaning of fine and/or delicate fabrics. Table 32-1 provides a suitable 
composition. However, it is not intended that the present invention be limited to this specific 
formulation, as many other formulations find use with the present invention. 



WO 2005/052146 






PCT/US2004/039066 



Table 32-1. Liquid Fabric Cleaning Composition

Component                        Amount (%)
AE2.5S                           2.16
AS                               3.30
N-Cocoyl N-methyl glucamine      1.10
Nonionic surfactant              10.00
Citric acid                      0.40
Fatty acid                       0.70
Base                             0.85
Monoethanolamine                 1.01
1,2-Propanediol                  1.92
EtOH                             0.24
HXS                              2.09
Protease¹                        0.01
Amylase                          0.06
Minors/inerts to 100%





EXAMPLE 33 
Liquid Dishwashing Compositions 

This Example provides liquid dishwashing compositions that find use in conjunction 
with the present invention. These compositions are contemplated to find particular utility 
under Japanese dish washing conditions. Table 33-1 provides suitable compositions.
However, it is not intended that the present invention be limited to this specific formulation, 
as many other formulations find use with the present invention. 



Table 33-1. Liquid Dishwashing Compositions

Component                        A              B
AE1.4S                           24.69          24.69
N-cocoyl N-methyl glucamine      3.09           3.09
Amine oxide                      2.06           2.06
Betaine                          2.06           2.06
Nonionic surfactant              4.11           4.11
Hydrotrope                       4.47           4.47
Magnesium                        0.49           0.49
Ethanol                          7.2            7.2
Lemon case                       [illegible]    -
Geraniol/BHT                     -              0.60/0.02
Amylase                          0.03           0.005
Protease                         0.01           0.43
Balance to 100%







EXAMPLE 34 
Liquid Fabric Cleaning Compositions 

The proteases of the present invention find particular use in cleaning compositions. 
For example, it is contemplated that liquid fabric cleaning composition of particular utility 
under Japanese machine wash conditions be prepared in accordance with the invention. In 
some preferred embodiments, these compositions comprise the following components 
shown in Table 34-1. 



Table 34-1. Liquid Fabric Cleaning Composition

Component                               Amount (%)
AE2.5S                                  15.00
AS                                      5.50
N-Cocoyl N-methyl glucamine             5.50
Nonionic surfactant                     4.50
Citric acid                             3.00
Fatty acid                              5.00
Base                                    0.97
Monoethanolamine                        5.10
1,2-Propanediol                         7.44
EtOH                                    5.50
HXS                                     1.90
Boric Acid                              3.50
Ethoxylated tetraethylenepentaimine     3.00
SRP                                     0.30
[component name illegible]              0.069
Amylase                                 [illegible]
Cellulase                               0.08
Lipase                                  0.18
Brightener                              0.10
Minors/inerts to 100%





EXAMPLE 35 
Granular Fabric Cleaning Compositions 

In this Example, various granular fabric cleaning compositions that find use with 
the present invention are provided. The following Tables provide suitable compositions. 
However, it is not intended that the present invention be limited to these specific 
formulations, as many other formulations find use with the present invention. 



Table 35-1. Granular Fabric Cleaning Compositions

Component                                              Formulations
                                                       A        B        C        D
Protease1                                              0.10     0.20     0.03     0.05
Protease2                                              -        -        0.2      0.15
C13 linear alkyl benzene sulfonate                     22.00    22.00    22.00    22.00
Phosphate (as sodium tripolyphosphate)                 23.00    23.00    23.00    23.00
Sodium carbonate                                       23.00    23.00    23.00    23.00
Sodium silicate                                        14.00    14.00    14.00    14.00
Zeolite                                                8.20     8.20     8.20     8.20
Chelant (diethylenetriaminepentaacetic acid)           0.40     0.40     0.40     0.40
Sodium sulfate                                         5.50     5.50     5.50     5.50
Water                                                  Balance to 100%
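
Because each formulation is specified "to 100%", the water balance follows from the listed percentages. The sketch below computes it for formulations A and C of Table 35-1; the dictionary layout and function name are illustrative assumptions, and the values are simply those listed in the table.

```python
from typing import Dict

# Shared ingredients of Table 35-1 (percent by weight, identical in A-D).
COMMON = {
    "C13 linear alkyl benzene sulfonate": 22.00,
    "Phosphate (as sodium tripolyphosphate)": 23.00,
    "Sodium carbonate": 23.00,
    "Sodium silicate": 14.00,
    "Zeolite": 8.20,
    "Chelant": 0.40,
    "Sodium sulfate": 5.50,
}

FORMULATION_A = {**COMMON, "Protease1": 0.10}
FORMULATION_C = {**COMMON, "Protease1": 0.03, "Protease2": 0.2}


def balance_to_100(components: Dict[str, float]) -> float:
    """Percentage left for the balancing ingredient (water in Table 35-1)."""
    total = sum(components.values())
    if total > 100.0:
        raise ValueError(f"listed ingredients already exceed 100%: {total:.2f}")
    return 100.0 - total


if __name__ == "__main__":
    print(f"A: {balance_to_100(FORMULATION_A):.2f}% water")
    print(f"C: {balance_to_100(FORMULATION_C):.2f}% water")
```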



Table 35-2. Granular Fabric Cleaning Compositions

Component                                              Formulations
                                                       A        B        C        D
Protease1                                              0.10     0.20     0.30     0.05
Protease2                                              -        -        0.2      0.1
C12 alkyl benzene sulfonate                            12.00    12.00    12.00    12.00
Zeolite A (1-10 micrometer)                            26.00    26.00    26.00    26.00
C12-C14 secondary (2,3) alkyl sulfate, Na salt         5.00     5.00     5.00     5.00
Sodium citrate                                         5.00     5.00     5.00     5.00
Optical brightener                                     0.10     0.10     0.10     0.10
Sodium sulfate                                         17.00    17.00    17.00    17.00
Fillers, water, minors                                 Balance to 100%



The following laundry detergent compositions are contemplated to provide particular 
utility under European machine wash conditions. 



Table 35-3. Granular Fabric Cleaning Compositions

Component                                   Formulations
                                            A          B          C
LAS                                         7.0        5.61       4.76
TAS                                         -          -          1.57
C45AS                                       6.0        2.24       3.89
C25E25                                      1.0        0.76       1.18
C45E7                                       -          -          2.0
C25E3                                       4.0        5.5        -
QAS                                         0.8        2.0        2.0
STPP                                        -          -          -
Zeolite                                     25.0       19.5       19.5
Citric acid                                 2.0        2.0        2.0
NaSKS-6                                     8.0        10.6       10.6
Carbonate                                   8.0        10.0       8.6
MA/AA                                       1.0        2.6        1.6
CMC                                         0.5        0.4        0.4
PB4                                         -          12.7       -
Percarbonate                                -          -          19.7
TAED                                        -          3.1        5.0
Citrate                                     7.0        -          -
DTPMP                                       0.25       0.2        0.3
HEDP                                        0.3        0.3        0.3
QEA1                                        0.9        1.2        1.0
Protease1                                   0.02       0.05       0.035
Lipase                                      0.15       0.25       0.15
Cellulase                                   0.28       0.28       0.28
Amylase                                     0.4        0.7        0.3
PVPI/PVNO                                   0.4        -          0.1
Photoactivated bleach (ppm)                 15 ppm     27 ppm     27 ppm
Brightener 1                                0.08       0.19       0.19
Brightener 2                                -          0.04       0.04
Perfume                                     0.3        0.3        0.3
Effervescent granules (malic acid 40%,      15         15         5
  sodium bicarbonate 40%, sodium
  carbonate 20%)
Silicone antifoam                           0.5        2.4        2.4
Minors/inerts to 100%                       Balance to 100%



EXAMPLE 36 
Detergent Formulations 

In this Example, various detergent formulations which find use with ASP and/or ASP
variants are provided. It is understood that the test methods provided in this section must
be used to determine the respective values of the parameters of the present invention.

In the exemplified detergent compositions, the enzyme levels are expressed as
pure enzyme by weight of the total composition and, unless otherwise specified, the
detergent ingredients are expressed by weight of the total composition. The abbreviated
component identifications therein have the following meanings:



Table 36-1. Definitions Used in this Example

LAS                   Sodium linear C11-13 alkyl benzene sulfonate.
TAS                   Sodium tallow alkyl sulphate.
CxyAS                 Sodium C1x-C1y alkyl sulfate.
CxyEz                 C1x-C1y predominantly linear primary alcohol condensed
                      with an average of z moles of ethylene oxide.
CxyAEzS               C1x-C1y sodium alkyl sulfate condensed with an average of
                      z moles of ethylene oxide. Added molecule name in the
                      examples.
Nonionic              Mixed ethoxylated/propoxylated fatty alcohol e.g. Plurafac
                      LF404 being an alcohol with an average degree of
                      ethoxylation of 3.8 and an average degree of propoxylation of
                      4.5.
QAS                   R2.N+(CH3)2(C2H4OH) with R2 = C12-C14.
Silicate              Amorphous Sodium Silicate (SiO2:Na2O ratio = 1.6-3.2:1).
Metasilicate          Sodium metasilicate (SiO2:Na2O ratio = 1.0).
Zeolite A             Hydrated Aluminosilicate of formula Na12(AlO2SiO2)12.27H2O
SKS-6                 Crystalline layered silicate of formula δ-Na2Si2O5
Sulfate               Anhydrous sodium sulphate.
STPP                  Sodium Tripolyphosphate.
MA/AA                 Random copolymer of 4:1 acrylate/maleate, average
                      molecular weight about 70,000-80,000.
AA                    Sodium polyacrylate polymer of average molecular weight
                      4,500.
Polycarboxylate       Copolymer comprising mixture of carboxylated monomers
                      such as acrylate, maleate and methacrylate with a MW
                      ranging between 2,000-80,000 such as Sokolan commercially
                      available from BASF, being a copolymer of acrylic acid,
                      MW 4,500.
BB1                   3-(3,4-Dihydroisoquinolinium)propane sulfonate
BB2                   1-(3,4-dihydroisoquinolinium)-decane-2-sulfate
PB1                   Sodium perborate monohydrate.
PB4                   Sodium perborate tetrahydrate of nominal formula
                      NaBO3.4H2O.
Percarbonate          Sodium percarbonate of nominal formula 2Na2CO3.3H2O2.
TAED                  Tetraacetyl ethylene diamine.
NOBS                  Nonanoyloxybenzene sulfonate in the form of the sodium salt.
DTPA                  Diethylene triamine pentaacetic acid.
HEDP                  1,1-hydroxyethane diphosphonic acid.
DETPMP                Diethyltriamine penta (methylene) phosphonate, marketed by
                      Monsanto under the Trade name Dequest 2060.
EDDS                  Ethylenediamine-N,N'-disuccinic acid, (S,S) isomer in the form
                      of its sodium salt
Diamine               Dimethyl aminopropyl amine; 1,6-hexane diamine; 1,3-propane
                      diamine; 2-methyl-1,5-pentane diamine; 1,3-pentanediamine;
                      1-methyl-diaminopropane.
DETBCHD               5,12-diethyl-1,5,8,12-tetraazabicyclo[6,6,2]hexadecane,
                      dichloride, Mn(II) salt
PAAC                  Pentaamine acetate cobalt(III) salt.
Paraffin              Paraffin oil sold under the tradename Winog 70 by
                      Wintershall.
Paraffin Sulfonate    A Paraffin oil or wax in which some of the hydrogen atoms
                      have been replaced by sulfonate groups.
Aldose oxidase        Oxidase enzyme sold under the tradename Aldose Oxidase
                      by Novozymes A/S
Galactose oxidase     Galactose oxidase from Sigma
Protease              Proteolytic enzyme sold under the tradename Savinase,
                      Alcalase, Everlase by Novo Nordisk A/S, and the following
                      from Genencor International, Inc: "Protease A" described in
                      US RE 34,606 in Figures 1A, 1B, and 7, and at column 11,
                      lines 11-37; "Protease B" described in US5,955,340 and
                      US5,700,676 in Figures 1A, 1B and 5, as well as Table 1; and
                      "Protease C" described in US6,312,936 and US 6,482,628 in
                      Figures 1-3 [SEQ ID 3], and at column 25, line 12, "Protease
                      D" being the variant
                      101G/103A/104I/159D/232V/236H/245R/248D/252K (BPN'
                      numbering) described in WO 99/20723.
Amylase               Amylolytic enzyme sold under the tradename Purafect® Ox
                      Am described in WO 94/18314, WO96/05295 sold by
                      Genencor; Natalase®, Termamyl®, Fungamyl® and
                      Duramyl®, all available from Novozymes A/S.
Lipase                Lipolytic enzyme sold under the tradename Lipolase, Lipolase
                      Ultra by Novozymes A/S and Lipomax by Gist-Brocades.
Cellulase             Cellulytic enzyme sold under the tradename Carezyme,
                      Celluzyme and/or Endolase by Novozymes A/S.
Pectin Lyase          Pectaway® and Pectawash® available from Novozymes A/S.
PVP                   Polyvinylpyrrolidone with an average molecular weight of
                      60,000
PVNO                  Polyvinylpyridine-N-Oxide, with an average molecular weight
                      of 50,000.
PVPVI                 Copolymer of vinylimidazole and vinylpyrrolidone, with an
                      average molecular weight of 20,000.
Brightener 1          Disodium 4,4'-bis(2-sulphostyryl)biphenyl.
Silicone antifoam     Polydimethylsiloxane foam controller with siloxane-
                      oxyalkylene copolymer as dispersing agent with a ratio of said
                      foam controller to said dispersing agent of 10:1 to 100:1.
Suds Suppressor       12% Silicone/silica, 18% stearyl alcohol, 70% starch in
                      granular form.
SRP1                  Anionically end capped poly esters.
PEGX                  Polyethylene glycol, of a molecular weight of x.
PVP K60®              Vinylpyrrolidone homopolymer (average MW 160,000)
Jeffamine® ED-2001    Capped polyethylene glycol from Huntsman
Isachem® AS           A branched alcohol alkyl sulphate from Enichem
MME PEG (2000)        Monomethyl ether polyethylene glycol (MW 2000) from Fluka
                      Chemie AG.
DC3225C               Silicone suds suppresser, mixture of Silicone oil and Silica
                      from Dow Corning.
TEPAE                 Tetraethylenepentaamine ethoxylate.
BTA                   Benzotriazole.
Betaine               (CH3)3N+CH2COO-
Sugar                 Industry grade D-glucose or food grade sugar
CFAA                  C12-C14 alkyl N-methyl glucamide
TPKFA                 C12-C14 topped whole cut fatty acids.
Clay                  A hydrated aluminum silicate in a general formula
                      Al2O3SiO2.xH2O. Types: Kaolinite, montmorillonite, atapulgite,
                      illite, bentonite, halloysite.
pH                    Measured as a 1% solution in distilled water at 20°C.
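
The CxyAS, CxyEz and CxyAEzS shorthand defined above can be expanded mechanically. The following is a minimal sketch, assuming that the two digits after "C" stand for the chain lengths C1x and C1y exactly as the definitions state; the function name and the regular expression are illustrative, and other abbreviations in the tables (for example C12-C15AE3S, which spells the range out) are deliberately not handled.

```python
import re

# C + digit x + digit y, followed by AS, Ez, or AEzS (z may be fractional).
_ABBREVIATION = re.compile(r"^C(\d)(\d)(?:(AS)|E(\d+(?:\.\d+)?)|AE(\d+(?:\.\d+)?)S)$")


def expand_surfactant(abbreviation: str) -> str:
    """Expand a CxyAS / CxyEz / CxyAEzS code using the Table 36-1 wording."""
    match = _ABBREVIATION.match(abbreviation.strip())
    if match is None:
        raise ValueError(f"unrecognized abbreviation: {abbreviation!r}")
    x, y, alkyl_sulfate, ethoxylate_moles, ether_sulfate_moles = match.groups()
    chain = f"C1{x}-C1{y}"
    if alkyl_sulfate:
        return f"Sodium {chain} alkyl sulfate"
    if ethoxylate_moles:
        return (f"{chain} predominantly linear primary alcohol condensed with "
                f"an average of {ethoxylate_moles} moles of ethylene oxide")
    return (f"{chain} sodium alkyl sulfate condensed with an average of "
            f"{ether_sulfate_moles} moles of ethylene oxide")


if __name__ == "__main__":
    for code in ("C45AS", "C25E3", "C45E7"):
        print(code, "->", expand_surfactant(code))
```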



The following Table (Table 36-2) provides liquid laundry detergent compositions that are 
prepared. 

Table 36-2. Liquid Laundry Detergent Compositions



Component


I


II


III


IV


V


LAS 


24.0 


32.0


6.0 


8.0 


6.0 








q a 


■1 i A 
II. u 


C A 

O.U 


C 8 -C 10 amido propyl dimethyl amine 


O A 


o a 


o a 


O A 


1 .u 


C 12 -Ci4 alkyl dimethyl amine oxide 




-


-


-


2.0 


Ci2~Ci5 AS 


- 


-


17.0 


7.0 


8.0 


CFAA 


- 


5.0 


4.0 


4.0 


3.0 


C12-C14 Fatty alcohol ethoxylate 


12.0 


6.0 


1.0 


1.0 


1.0 


C12-C18 Fatty acid 


3.0 


- 


4.0 


4.0 


3.0 


Citric acid (anhydrous) 


6.0 


5.0 


3.0 


3.0


2.0 


DETPMP 


- 


- 


1.0 


1.0 


0.5 


Monoethanolamine 


-


-


5.0 


5.0 


2.0 


Sodium hydroxide 


- 


- 


2.5 


1.0 


1.5 


Propanediol 


12.7 


14.5 


13.1 


10. 


8.0 


Ethanol 


1.8 


2.4 


4.7 


5.4 


1.0 


DTPA 


0.5 


0.4 


0.3 


0.4 


0.5 


Pectin Lyase 


- 


- 


- 


0.005 


- 


Amylase 


0.001 


0.002 


- 




- 


Cellulase 


- 


- 


0.0002 


- 


0.0001 


Lipase 


0.1 


- 


0.1 


- 


0.1 


ASP 


0.05 


0.3 


0.08 


0.5 


0.2 


Protease A 


- 


-


- 


- 


0.1 


Aldose Oxidase 


- 


- 


0.3 


- 


0.003 


DETBCHD 


- 


- 


0.02 


0.01 


- 


SRP1 


0.5 


0.5 




0.3 


0.3 


Boric acid 


2.4 


2.4 


2.8 


2.8 


2.4 


Sodium xylene sulfonate 






3.0 






DC 3225C 


1.0 


1.0 


1.0 


1.0 


1.0 


2-butyl-octanol 


0.03 


0.04 


0.04 


0.03 


0.03 


Brightener 1 


0.12 


0.10 


0.18 


0.08 


0.10 


Balance to 100% perfume / dye and/or water



for (II).

The following Table (36-3) provides hand dish liquid detergent compositions that are
prepared.



Table 36-3. Hand Dish Liquid Detergent Compositions 


Component 


I 


II 


III


IV 


V 


VI 


C12-C15AE1.8S


30.0 


28.0 


25.0 




15.0 


10.0 


LAS 


- 






5.0 


15.0 


12.0 


Paraffin Sulfonate 


- 




- 


20.0 


- 


- 


C10-C18 Alkyl Dimethyl
Amine Oxide 


5.0 


3.0 


7.0 


- 


** 


- 


Betaine 


3.0 




1.0 


3.0 


1.0 




C12 poly-OH fatty acid 
amide 


■ 






3.0 




1.0 


C14 poly-OH fatty acid 
amide 




1.5 










C11E9 


2.0 


■ 


4.0 




— 


20.0 


DTPA 






m 




0.2 


■ 


Tri-sodium Citrate dihydrate 


0.25 






0.7 






Diamine 


1.0 


5.0 


7.0 


1.0 


5.0 


7.0 


MgCl2


0.25 






1.0 






ASP 


0.02 


0.01 


0.03 


0.01 


0.02 


0.05 


Protease A 




0.01 










Amylase 


0.001 






0.002 




0.001 


Aldose Oxidase 


0.03 




0.02 




0.05 




Sodium Cumene 
Sulphonate 








2.0 


1.5 


3.0 


PAAC 


0.01 


0.01


0.02 








DETBCHD 








0.01 


0.02 


0.01 


Balance to 100% perfume / dye and/or water



The pH of these compositions is about 8 to about 11.

Table 36-4 provides liquid automatic dishwashing detergent compositions that are
prepared.



Table 36-4. Liquid Automatic Dishwashing Detergent Compositions 





I


II


III


IV


V


STPP 


16 


16 


18 


16 


16 


Potassium Sulfate 




10 


8 




10 


1,2 propanediol 


6.0 


0.5 


2.0 


6.0 


0.5 


Boric Acid 


4.0 


3.0 


3.0 


4.0 


3.0 


CaCl2 dihydrate


0.04 


0.04 


0.04 


0.04 


0.04 


Nonionic 


0.5 


0.5 


0.5 


0.5 


0.5 


ASP 


0.1 


0.03 


0.05 


0.03 


0.06 


Protease B 








0.01 




Amylase 


0.02 




0.02 


0.02 




Aldose Oxidase 




0.15 


0.02 




0.01 


Galactose Oxidase 






0.01 




0.01 


PAAC 


0.01 






0.01 




DETBCHD 




0.01 






0.01 



Balance to 100% perfume / dye and/or water 



Table 36-5 provides laundry compositions that are prepared in the form of
granules or tablets.



Table 36-5. Laundry Compositions 
Base Product I II III IV V 



C 14 -C 15 AS or TAS 


8.0 


5.0 


3.0 


3.0 


3.0 


LAS 


8.0 




8.0 




7.0 


C12-C15AE3S


0.5 


2.0 


1.0 






C12-C15E5 or E3


2.0 




5.0 


2.0 


2.0 


QAS 








1.0 


1.0 


Zeolite A 


20.0 


18.0 


11.0 




10.0 


SKS-6(dryadd) 






9.0 






MA/AA 


2.0 


2.0 


2.0 






AA 










4.0 


3Na Citrate 2H2O




2.0








Citric Acid (Anhydrous) 


2.0 




1.5 


2.0 




DTPA 


0.2 


0.2 








EDDS 






0.5 


0.1 




HEDP 






0.2 


0.1 




PB1 


3.0 


4.8 






4.0 


Percarbonate 






3.8 


5.2 




NOBS 


1.9 










NACA OBS 






2.0 






TAED 


0.5 


2.0 


2.0 


5.0 


1.00 


BB1 


0.06 




0.34 




0.14 


BB2 




0.14 




0.20 




Anhydrous Na Carbonate 


15.0 


18.0 


8.0 


15.0 


15.0


Sulfate 


5.0 


12.0 


2.0 


17.0 


3.0 


Silicate 




1.0 






8.0 


ASP 


0.03 


0.05 


1.0 


0.06 


0.1


Protease B 




0.01 








Protease C 








0.01 




Lipase 




0.008 








Amylase 


0.001 








0.001 


Cellulase 




0.0014 








Pectin Lyase 


0.001 


0.001 


0.001 


0.001 


0.001 


Aldose Oxidase 


0.03 




0.05 






PAAC 




0.01 






0.05 



Balance to 100% Moisture and/or Minors* 

* Perfume, Dye, Brightener / SRP1 / Na Carboxymethylcellulose/ Photobleach / MgSO4 /
PVPVI/ Suds suppressor /High Molecular PEG/Clay. 

Table 36-6 provides liquid laundry detergent formulations which are prepared. 



Table 36-6. Liquid Laundry Detergent Formulations 



Component 


1 


1 


II 


III 


IV 


V 


LAS 


11.5 


11.5 


9.0 




4.0 




C12-C15AE2.85S 






3.0 


18.0 




16.0 


C14-C15E2.5S


11.5 


11.5 


3.0 




16.0 




C12-C13E9






3.0 


2.0 


2.0 


1.0 


C12-C13E7


3.2 


3.2 










CFAA 








5.0 




3.0 


TPKFA 


2.0 


2.0 




2.0 


0.5 


2.0 


Citric Acid 


3.2 


3.2 


0.5 


1.2 


2.0 


1.2 


(Anhydrous) 














Ca formate 


0.1 


0.1 


0.06 


0.1 






Na formate 


0.5 


0.5 


0.06 


0.1 


0.05 


0.05 


Na Cumene


4.0 


4.0 


1.0 


3.0 


1.2 




Sulfonate 














Borate 


0.6 


0.6 




3.0 


2.0 


3.0 


Na Hydroxide 


6.0 


6.0 


2.0 


3.5 


4.0 


3.0 


Ethanol 


2.0 


2.0 


1.0 


4.0 


4.0 


3.0 


1,2 Propanediol 


3.0 


3.0 


2.0 


8.0 


8.0 


5.0 


Mono- 


3.0 


3.0 


1.5 


1.0 


2.5 


1.0 


ethanolamine 














TEPAE 


2.0 


2.0 




1.0 


1.0 


1.0 


ASP 


0.03 


0.05 


0.01 


0.03 


0.08 


0.02 


Protease A 






0.01 








Lipase 








0.002 






Amylase 










0.002 




Cellulase 












0.0001 


Pectin Lyase 


0.005 


0.005 










Aldose Oxidase 


0.05 






0.05 




0.02 


Galactose oxidase 




0.04


PAAC 


0.03 


0.03 


0.02 








DETBCHD 








0.02 


0.01 




SRP 1 


0.2 


0.2 




0.1 






DTPA 








0.3 






PVNO 








0.3 




0.2 


Brightener 1 


0.2 


0.2 


0.07 


0.1 






Silicone antifoam 


0.04 


0.04 


0.02 


0.1 


0.1 


0.1 


Balance to 100% perfume/dye and/or water 











Table 36-7 provides compact high density dishwashing detergents that are prepared. 



Table 36-7. Compact High Density Dishwashing Detergents

Component             I       II      III     IV      V       VI
STPP                  -       45.0    45.0    -       -       40.0
3Na Citrate 2H2O      17.0    -       -       50.0    40.2    -
Na Carbonate          17.5    14.0    20.0    -       8.0     33.6
Bicarbonate           -       -       -       26.0    -       -
Silicate              15.0    15.0    8.0     -       25.0    3.6
Metasilicate          2.5     4.5     4.5     -       -       -
PB1                   -       -       4.5     -       -       -
PB4                   -       -       -       5.0     -       -
Percarbonate          -       -       -       -       -       4.8
BB1                   -       0.1     0.1     -       0.5     -
BB2                   0.2     0.05    -       0.1     -       0.6
Nonionic              2.0     1.5     1.5     3.0     1.9     5.9
HEDP                  1.0     -       -       -       -       -
DETPMP                0.6     -       -       -       -       -
PAAC                  0.03    0.05    0.02    -       -       -
Paraffin              0.5     0.4     0.4     0.6     -       -
ASP                   0.072   0.053   0.053   0.026   0.059   0.01
Protease B            -       -       -       -       -       0.01
Amylase               0.012   -       0.012   -       0.021   0.006
Lipase                -       0.001   -       0.005   -       -
Pectin Lyase          0.001   0.001   0.001   -       -       -
Aldose Oxidase        0.05    0.05    0.03    0.01    0.02    0.01
BTA                   0.3     0.2     0.2     0.3     0.3     0.3
Polycarboxylate       6.0     -       -       -       4.0     0.9
Perfume               0.2     0.1     0.1     0.2     0.2     0.2
Balance to 100% Moisture and/or Minors*

*Brightener / Dye / SRP1 / Na Carboxymethylcellulose / Photobleach / MgSO4 / PVPVI / Suds
suppressor / High Molecular PEG / Clay.

The pH of the above compositions is from about 9.6 to about 11.3.
Table 36-8 provides tablet detergent compositions of the present invention that are
prepared by compression of a granular dishwashing detergent composition at a pressure of
13 kN/cm² using a standard 12-head rotary press:



Table 36-8. Tablet Detergent Compositions

Component             I       II      III     IV      V       VI      VII     VIII
STPP                  -       48.8    44.7    38.2    -       42.4    46.1    36.0
3Na Citrate 2H2O      20.0    -       -       -       35.9    -       -       -
Na Carbonate          20.0    5.0     14.0    15.4    8.0     23.0    20.0    28.0
Silicate              15.0    14.8    15.0    12.6    23.4    2.9     4.3     4.2
Lipase                0.001   -       0.01    -       0.02    -       -       -
Protease B            0.01    -       -       -       -       -       -       -
Protease C            -       -       -       -       -       0.01    -       -
ASP                   0.01    0.08    0.05    0.04    0.052   0.023   0.023   0.029
Amylase               0.012   0.012   0.012   -       0.015   -       0.017   0.002
Pectin Lyase          0.005   -       -       0.002   -       -       -       -
Aldose Oxidase        -       0.03    -       0.02    0.02    -       0.03    -
PB1                   -       -       3.8     -       7.8     -       -       8.5
Percarbonate          6.0     -       -       6.0     -       5.0     -       -
BB1                   0.2     -       0.5     -       0.3     0.2     -       -
BB2                   -       0.2     -       0.5     -       -       0.1     0.2
Nonionic              1.5     2.0     2.0     2.2     1.0     4.2     4.0     6.5
PAAC                  0.01    0.01    0.02    -       -       -       -       -
DETBCHD               -       -       -       0.02    0.02    -       -       -
TAED                  -       -       -       -       -       2.1     -       1.6
HEDP                  1.0     -       -       0.9     -       0.4     0.2     -
DETPMP                0.7     -       -       -       -       -       -       -
Paraffin              0.4     0.5     0.5     0.5     -       -       0.5     -
BTA                   0.2     0.3     0.3     0.3     0.3     0.3     0.3     -
Polycarboxylate       4.0     -       -       -       4.9     0.6     0.8     -
PEG 400-30,000        -       -       -       -       -       2.0     -       2.0
Glycerol              -       -       -       -       -       0.4     -       0.5
Perfume               -       -       -       0.05    0.2     0.2     0.2     0.2
Balance to 100% Moisture and/or Minors*

*Brightener / SRP1 / Na Carboxymethylcellulose / Photobleach / MgSO4 / PVPVI / Suds
suppressor / High Molecular PEG / Clay.

The pH of these compositions is from about 10 to about 11.5. 

The tablet weight of these compositions is from about 20 grams to about 30 grams. 
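
As a rough illustration of what the ASP levels in Table 36-8 mean per tablet, the following sketch converts each level into a mass of ASP for the lightest and heaviest tablets in the stated range. It assumes the table entries are weight percentages of the finished tablet; that reading, the function name and the dictionary layout are illustrative assumptions and are not stated in the table itself.

```python
# ASP levels for formulations I-VIII, copied from Table 36-8.
# Assumption: each value is a weight percentage of the finished tablet.
ASP_WT_PERCENT = {
    "I": 0.01, "II": 0.08, "III": 0.05, "IV": 0.04,
    "V": 0.052, "VI": 0.023, "VII": 0.023, "VIII": 0.029,
}

# Tablet weight range stated in the text (grams).
TABLET_WEIGHT_G = (20.0, 30.0)


def asp_mg_per_tablet(wt_percent, tablet_g):
    """Mass of ASP (mg) in one tablet of the given weight (hypothetical helper)."""
    return tablet_g * wt_percent / 100.0 * 1000.0


if __name__ == "__main__":
    low_g, high_g = TABLET_WEIGHT_G
    for name, pct in ASP_WT_PERCENT.items():
        low_mg = asp_mg_per_tablet(pct, low_g)
        high_mg = asp_mg_per_tablet(pct, high_g)
        print(f"Formulation {name}: {low_mg:.1f} to {high_mg:.1f} mg ASP per tablet")
```

Under that assumption, formulation I would carry about 2 to 3 mg of ASP per tablet, and formulation II about 16 to 24 mg.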

Table 36-9 provides liquid hard surface cleaning detergent compositions of the 
present invention that are prepared. 



Table 36-9. Liquid Hard Surface Cleaning Detergent Compositions

Component                 I       II      III     IV      V       VI      VII
C9-C11E5                  2.4     1.9     2.5     2.5     2.5     2.4     2.5
C12-C14E5                 3.6     2.9     2.5     2.5     2.5     3.6     2.5
C7-C9E6                   -       -       -       -       8.0     -       -
C12-C14E21                1.0     0.8     4.0     2.0     2.0     1.0     2.0
LAS                       -       -       -       0.8     0.8     -       0.8
Sodium cumene sulfonate   1.5     2.6     -       1.5     1.5     1.5     1.5
Isachem® AS               0.6     0.6     -       -       -       0.6     -
Na2CO3                    0.6     0.13    0.6     0.1     0.2     0.6     0.2
3Na Citrate 2H2O          0.5     0.56    0.5     0.6     0.75    0.5     0.75
NaOH                      0.3     0.33    0.3     0.3     0.5     0.3     0.5
Fatty Acid                0.6     0.13    0.6     0.1     0.4     0.6     0.4
2-butyl octanol           0.3     0.3     -       0.3     0.3     0.3     0.3
PEG DME-2000®             0.4     -       0.3     0.35    0.5     -       -
PVP                       0.3     0.4     0.6     0.3     0.5     -       -
MME PEG (2000)®           -       -       -       -       -       0.5     0.5
Jeffamine® ED-2001        -       0.4     -       -       0.5     -       -
PAAC                      -       -       -       0.03    0.03    0.03    -
DETBCHD                   0.03    0.05    0.05    -       -       -       -
ASP                       0.07    0.05    0.08    0.03    0.06    0.01    0.04
Protease B                -       -       -       -       -       0.01    -
Amylase                   0.12    0.01    0.01    -       0.02    -       0.01
Lipase                    -       0.001   -       0.005   -       0.005   -
Pectin Lyase              0.001   -       0.001   -       -       -       0.00J
PB1                       -       4.6     -       3.8     -       -       -
Aldose Oxidase            0.05    -       0.03    -       0.02    0.02    0.05
Balance to 100% perfume / dye and/or water

The pH of these compositions is from about 7.4 to about 9.5. 



EXAMPLE 37 
Animal Feed Comprising ASP 

The present invention also provides animal feed compositions comprising ASP and/or ASP
variants. In this Example, one such feed, suitable for poultry, is provided.
However, it is not intended that the present invention be limited to this specific formulation,
as the proteases of the present invention find use with numerous other feed formulations. It
is further intended that the feeds of the present invention be suitable for administration to
any animal, including but not limited to livestock (e.g., cattle, pigs, sheep, etc.), as well as
companion animals (e.g., dogs, cats, horses, rodents, etc.). The following Table provides a
formulation for a mash, namely a maize-based starter feed suitable for administration to
turkey poults up to 3 weeks of age.



Table 37-1. Animal Feed Composition

Ingredient                  Amount (wt. %)
Maize                       36.65
Soybean meal (45.6% CP)     55.4
Animal-vegetable fat        3.2
Dicalcium phosphate         2.3
Limestone                   1.5
Mineral premix              0.3
Vitamin premix              0.3
Sodium chloride             0.15
DL methionine               0.2


In some embodiments, this feed formulation is supplemented with various 
concentrations of the protease(s) of the present invention (e.g., 2,000 units/kg, 4,000 
units/kg and 6,000 units/kg). 
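
The arithmetic behind scaling this formulation to a batch can be sketched as follows. The ingredient levels and the three supplementation levels are taken from Table 37-1 and the preceding sentence; the 500 kg batch size and the function names are purely hypothetical and are included only to make the example runnable.

```python
# Ingredient levels (wt. %) copied from Table 37-1.
FEED_WT_PERCENT = {
    "Maize": 36.65,
    "Soybean meal (45.6% CP)": 55.4,
    "Animal-vegetable fat": 3.2,
    "Dicalcium phosphate": 2.3,
    "Limestone": 1.5,
    "Mineral premix": 0.3,
    "Vitamin premix": 0.3,
    "Sodium chloride": 0.15,
    "DL methionine": 0.2,
}

# Example protease supplementation levels from the text (units per kg of feed).
PROTEASE_UNITS_PER_KG = (2000, 4000, 6000)


def batch_masses_kg(batch_kg):
    """Mass of each ingredient (kg) needed for a batch of the given size."""
    return {name: batch_kg * pct / 100.0 for name, pct in FEED_WT_PERCENT.items()}


def protease_units(batch_kg, units_per_kg):
    """Total protease units needed to supplement the whole batch."""
    return batch_kg * units_per_kg


if __name__ == "__main__":
    batch_kg = 500.0  # hypothetical batch size, not taken from the source
    print(f"Ingredient total: {sum(FEED_WT_PERCENT.values()):.2f} wt. %")
    for name, kg in batch_masses_kg(batch_kg).items():
        print(f"{name}: {kg:.2f} kg")
    for level in PROTEASE_UNITS_PER_KG:
        print(f"{level} units/kg: {protease_units(batch_kg, level):,.0f} units")
```

The listed ingredients sum to 100 wt. %, so the protease is treated here as an addition on top of the base feed rather than as a replacement for any ingredient; that treatment is an assumption of the sketch.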

All patents and publications mentioned in the specification are indicative of the levels
of those skilled in the art to which the invention pertains. All patents and publications are
herein incorporated by reference to the same extent as if each individual publication was
specifically and individually indicated to be incorporated by reference. However, the citation
of any publication is not to be construed as an admission that it is prior art with respect to
the present invention.

Having described the preferred embodiments of the present invention, it will appear 
to those ordinarily skilled in the art that various modifications may be made to the disclosed 
embodiments, and that such modifications are intended to be within the scope of the 
present invention. 
Those of skill in the art readily appreciate that the present invention is well adapted
to carry out the objects and obtain the ends and advantages mentioned, as well as those
inherent therein. The compositions and methods described herein are representative of
preferred embodiments, are exemplary, and are not intended as limitations on the scope of
the invention. It is readily apparent to one skilled in the art that varying substitutions and
modifications may be made to the invention disclosed herein without departing from the
scope and spirit of the invention.

The invention illustratively described herein suitably may be practiced in the absence
of any element or elements, limitation or limitations which are not specifically disclosed herein.
The terms and expressions which have been employed are used as terms of description
and not of limitation, and there is no intention, in the use of such terms and expressions,
of excluding any equivalents of the features shown and described or portions thereof, but it
is recognized that various modifications are possible within the scope of the invention
claimed. Thus, it should be understood that although the present invention has been
specifically disclosed by preferred embodiments and optional features, modification and
variation of the concepts herein disclosed may be resorted to by those skilled in the art, and
that such modifications and variations are considered to be within the scope of this invention
as defined by the appended claims.

The invention has been described broadly and generically herein. Each of the
narrower species and subgeneric groupings falling within the generic disclosure also forms
part of the invention. This includes the generic description of the invention with a proviso or
negative limitation removing any subject matter from the genus, regardless of whether or not
the excised material is specifically recited herein.






